Description

This invention relates to a method for distinguish- 
ing a first chemical compound from a second chemi- 
cal compound on the basis of chromatographic data 
wherein said chemical compounds absorb ultraviolet 
radiation, according to the prior art portion of claim 1
(see Paper No. 892, presented by H.-J.P. Sievert at
the 39th Pittsburgh Conference and Exhibition,
1988).

There can be little doubt that mixtures of chemi- 
cal compounds have achieved great importance in 
modern society. The nature and operation of such 
mixtures are of frequent concern in fields such as 
agriculture, manufacturing, scientific research, and 
medicine. Indeed, the human body could scarcely 
function in the absence of chemical mixtures. Accord- 
ingly, it is frequently an object of medicine and other 
arts to determine the identity and concentration of the
components in chemical mixtures found, for example, 
in the human body or other chemical reaction sys- 
tems. Analysis of this sort finds numerous applica- 
tions and provides the primary basis for a wide variety 
of product quality control programs and medical diag- 
nostic techniques. 

Probably the most common method for analyzing
a mixture of one or more chemical compounds entails 
isolating and then characterizing each compound. 
Chromatography provides one means for effecting 
such isolation. In virtually all chromatographic sepa- 
rations, a mobile phase comprising a mixture of 
chemical compounds passes through a stationary 
bulk phase. Gas and liquid chromatography provide 
examples of techniques in which gases and liquids, 
respectively, are employed as the mobile phase. A 
number of variations on both gas and liquid chroma- 
tography are known in the art. The choice of a given 
variation depends intimately upon the particular separation
to be performed. For example, high-performance liquid
chromatography (HPLC), a technique in which a liquid
mobile phase is passed
through the stationary phase under the influence of 
high pressure, finds particular use in the separation 
and analysis of difficultly separated compounds hav- 
ing relatively high molecular weight. 

Compounds separated by HPLC or other types of 
chromatography are generally then passed through a 
detector responsive to one or more of the compounds. 
Flame ionization, thermal conductivity, and ultraviolet
(UV)/visible devices provide examples of commonly
employed detectors. As will be appreciated by
those skilled in the art, ultraviolet detectors measure
the degree to which a given chemical species absorbs
electromagnetic radiation having wavelength between
about 200 and about 400 nanometers (nm).
Those of skill in the art will also recognize that a
detector's positive response to a chemical compound is
commonly referred to as a peak. A detector's response
to each isolated component of a chemical mixture
is often recorded, such as on paper or magnetic
media. A recorded sequential assemblage of peaks is
known in the art as a chromatogram.

A mixture of chemical compounds will commonly
produce a chromatogram somewhat characteristic of
that mixture. However, the particular chromatogram
produced by a given chemical mixture will be greatly
dependent upon the conditions under which said
chromatogram is generated. As will be appreciated by
those skilled in the art, factors which may influence a
chromatogram include the solvent employed as an
eluent, the pressure employed in the chromatographic
system, the type of stationary phase used, and the
nature of the chromatographic apparatus itself.

Because a chromatogram is to a certain degree
characteristic of a mixture of chemical compounds,
chromatograms are often compared in order to distinguish
one such mixture from another. For example,
retention times derived from a chromatogram provide
one basis for such distinction. Retention times represent
the time intervals required for the isolation and
detection of the individual chemical components of a
mixture subjected to chromatographic analysis and
are measured from the start of the analysis. The
height and area of individual peaks provide additional
bases for comparing two chemical mixtures. Comparative
analysis on the basis of such data will understandably
be complex where analyzed mixtures comprise
many individual compounds and will be further
complicated by variations in the conditions under
which subject chromatograms are generated. Thus,
the results of such analyses often can only be considered
unambiguous when combined with other independent
analytical methods.

Accordingly, the analysis of chromatographic
data is frequently combined with or supplanted by
other techniques. One such technique involves measuring
the response of isolated chemical compounds
upon exposure to one or more frequencies of infrared,
UV, visible, or other forms of electromagnetic radiation.
It is known, for example, that ultraviolet spectral
data can provide structural information regarding
compounds that have been separated on an HPLC
system. Unfortunately, however, the interpretation of
UV spectral data is often more difficult than interpretation
of, for example, infrared spectral data. This difficulty
can be compounded by the fact that the analysis
of spectral data had traditionally been based on
visual evaluation and comparison of spectra selected
during elution of a mixture. These comparison techniques
for UV spectra traditionally utilized only a few
points in the spectral profile to validate identification.
However, the fairly recent introduction of full-spectrum
photodiode-array ultraviolet detectors has
significantly altered traditional UV spectral analysis.
Diode-array spectrophotometers yield on-line spectra
and allow rapid collection of spectra over the
ultraviolet and/or visible range in digital form. These
instruments, when interfaced with HPLC systems, provide
a powerful tool for the analysis of complex mixtures
that are not amenable to gas chromatography or
other types of separations. For example, those skilled
in the art will appreciate that when the composition of
a liquid chromatography mobile phase is varied for
the same chemical mixture, the order in which its
constituent compounds elute from a chromatographic
apparatus can and often does change. The order in
which peaks associated with these compounds are
recorded will, in turn, correspondingly vary. In order
to identify peaks of interest it is vital that the peaks be
tracked as their elution is varied by the solvent. In
principle, the use of a diode-array detector can provide
this facility.

EP 0 437 829 B1

Diode-array ultraviolet detection, however, is not
without its limitations. For example, peaks can and often
do overlap and respective UV spectra are sometimes
insufficiently different to provide unique identification.
In addition, because diode-array detectors
commonly generate large amounts of information
from a single chromatographic analysis, manual and
interactive data reduction methods can prove time
consuming and are often incomplete and imprecise.
Consequently, the development of diode-array devices
has hastened the development of mathematical
techniques for analyzing UV spectral data. Such
mathematical methods can be used to extend the use
of diode-array data by the deconvolution of peaks and
by using pattern recognition techniques.

Thus, a great deal of attention in the art has been
directed to the implementation of diode-array UV detectors
in the analysis of chemical compounds and
mixtures of chemical compounds. The goal of nearly
all such techniques has been to determine the identity
of an unknown compound by comparing its spectral
data against vast libraries of similar data for known
compounds. Identification techniques following this
format are known as forward searches.

It would be of great utility, however, to also perform
reverse searches of spectral data to identify a
predetermined number of known components that
are expected to be present in an unknown sample or
to distinguish dissimilar compounds or mixtures. Reverse
search spectral analysis could be employed in
areas such as the quality control of manufactured
chemicals where it is required that certain components
be present in a given sample and the presence
of additional components is undesirable, even critical.

SUMMARY OF THE INVENTION: 

It is an object of this invention to provide a method
and apparatus for distinguishing two mixtures of
chemical compounds.

Another object of this invention is to provide a
method and apparatus for distinguishing two mixtures
of chemical compounds on the basis of chromatographic
data.

Yet another object of this invention is to provide 
a method and apparatus for distinguishing two mix- 
tures of chemical compounds on the basis of spectral 
data. 

Still another object of this invention is to provide 
a method and apparatus for distinguishing two such 
mixtures by isolating and comparing their respective 
constituent chemical compounds. 

It is a further object of this invention to provide a
method and apparatus for distinguishing two chemical
compounds on the basis of chromatographic and
UV spectral data.

Accordingly, this invention provides a method
and apparatus for distinguishing a first mixture of
chemical compounds from a second mixture of chemical
compounds by analyzing chromatographic and
spectrophotometric data associated with chemical
compounds isolated from the mixtures. The method
and apparatus provide spectral match factors and
peak scores which correlate the chemical compounds.
These match factors and peak scores are
then employed in calculating sample scores indicative
of the similarities between the mixtures.

In a preferred embodiment, the method comprises
the steps of isolating the chemical compounds of
the first and second mixtures using chromatography;
exposing each isolated chemical compound one or
more times to one or more selected wavelengths of
ultraviolet radiation; and recording the respective
absorbances of the isolated chemical compounds upon
each exposure to the ultraviolet radiation. The respective
absorbances of the isolated chemical compounds
are then provided to processing means as a
first data set. Further steps performed by the processing
means include providing at least one general
match factor by applying a general matching function
to the first data set; providing respective average
absorbances for the isolated chemical compounds at
each selected wavelength by applying an averaging
function to the first data set; providing automatch factors
by applying an automatching function to the first
data set and to the average absorbances; providing
crossmatch factors by applying a crossmatching
function to the first data set and to the average
absorbances; and providing match discriminators by applying
a match discrimination function to the general
match factors. A second data set is then provided to
the processing means, said second data set comprising
the respective retention times, peak areas, and
peak heights for the isolated chemical compounds.
Subsequent steps performed by the processing
means include providing retention deviations by applying
a retention deviation function to the second
data set; providing peak area deviations by applying
a peak area deviation function to the second data set;
providing peak height deviations by applying a peak
height deviation function to the second data set; providing
area and height deviations by applying an area
and height deviation function to the peak area deviations
and the peak height deviations; assigning
peaks by applying a hierarchical assignment procedure;
providing at least one peak score for the isolated
chemical compounds by applying a peak scoring
function to the match discriminators, retention deviations,
and area and height deviations; providing at least one
sample score by applying, via the processing means,
a sample scoring function to the peak scores; and
distinguishing the first mixture of chemical compounds
from the second mixture of chemical compounds on
the basis of at least one sample score.

BRIEF DESCRIPTION OF THE DRAWINGS: 

The numerous objects and advantages of the
present invention may be better understood by those
skilled in the art by reference to the accompanying
figures of which:

Figures 1a-c provide two wavelength-shifted absorbance
plots for the same chemical compound and
a plot of general match factor versus wavelength. The
figures illustrate wavelength shift and its correction
by analysis of general match factors.

Figure 2 is an HPLC chromatogram of r-hGH 
separated with gradient I. 

Figures 3a-c illustrate moderate spectral match
between two tryptic peptides. Figure 3a shows the
UV spectra for the two peptides. Figure 3b shows the
distribution arising from plotting pairwise absorbance
values for both peptides at identical wavelengths, and
Figure 3c shows a comparison of the match factors for
all spectra for the two peptides.

Figures 4a-c illustrate strong spectral match between
two tryptic peptides. Figure 4a shows the UV
spectra for the two peptides. Figure 4b shows the
distribution arising from plotting pairwise absorbance
values for both peptides at identical wavelengths, and
Figure 4c shows a comparison of the match factors for
all of the spectra for the two peptides.

Figures 5a and 5b illustrate background correction
for peak spectra for a tryptic peptide. Figure 5a
shows the comparison of uncorrected upslope, downslope
and apex spectra for the peptide peak with a
standard spectrum. Figure 5b presents the same
spectra after background correction had been applied.

Figure 6 illustrates reproducibility of the tryptic
map analyzed with gradient II. The figure shows the
superimposition of four replicate elution profiles.

Figure 7 provides a table of standard deviations
for retention time, peak area, peak height, and match
factor for tryptic digests from r-hGH.

Figure 8 provides tables illustrating the similarity
between replicate samples of (a) tryptic digests from
r-hGH analyzed with gradient I and of (b) native and
oxidized tryptic digests from r-hGH analyzed with
gradient II.

Figure 9 is an HPLC chromatogram of the tryptic 
map for oxidized r-hGH analyzed with gradient II. The 
elution position for the unoxidized peptides is indicated
by arrows.

Figure 10 is a flowchart illustrating the Make- 
Library subprogram. 

Figure 11 is a flowchart illustrating the Compare- 
Libs subprogram. 

Figure 12 is a flowchart illustrating the Make-Std- 
Library subprogram. 

Figure 13 is a flowchart illustrating the Get-Sample-Score
program.

DESCRIPTION OF THE PREFERRED 
EMBODIMENTS: 

The principles and methods of the present invention
are applicable to a number of situations relating
to the comparison of individual chemical compounds
and mixtures from which they may be derived. Thus,
it will be appreciated that the present invention may
be practiced in situations where the identities of both 
of the compared species are unknown or, preferably, 
in situations where the identity of one species is well- 
known and that of the other is unknown. It is particu- 
larly preferred that a library of calibration data for one 
species be available. It is also preferred that chroma- 
tographic data relating to both compared species be 
available. Chromatographic data includes retention 
times, peak areas, and peak heights. 

In accordance with the present invention, mix- 
tures of chemical compounds are first isolated into 
their respective components. A preferred means of 
effecting such isolation is through the employment of 
chromatography. Any form of chromatography might 
conceivably be employed in the practice of this invention,
although liquid chromatography is preferred. It is
particularly preferred that high-performance liquid
chromatography (HPLC) be employed in isolating the
chemical compounds of mixtures to be analyzed in
accordance with the present invention.

Once isolated, chemical compounds are exposed
one or more times to one or more selected wavelengths
of ultraviolet radiation and the respective
absorbances of the isolated chemical compounds upon
each exposure are recorded. Those skilled in the art
will appreciate that the reliability of data derived from
such exposure will increase with the number of times
such exposure is effected and with the number of
wavelengths employed.

Once recorded, such data is provided to processing
means. Processing means amenable to the practice
of this invention consist of a computing device
such as the HP9000 Series 300 Pascal Workstation
or any equivalent computing device capable of compiling
and executing instructions. These instructions
should be provided in a programming language such
as Pascal or any equivalent thereof capable of implementing
the algorithms of this invention. Processing
means further include an input device such as a keyboard
and an output device such as a video display
or printer. Preferred processing means further include
one or more devices for the storage of data, such as
magnetic disks or tape. Processing means should
also comprise an operating system or programming
environment for the generation of source code in the
appropriate programming language, along with a compiler
or other means of converting such source code
into executable programs.

Data may be provided to the processing means 
precisely as recorded or may be prepared or pretreat- 
ed by various means well known to those skilled in the 
art. Examples of such preparation or pretreatment 
are wavelength calibration, smoothing, and transfor- 
mation of the data, such as fast Fourier transform. 

In accordance with the present invention, algo- 
rithms implemented by the processing means are pro- 
vided. In certain preferred embodiments, these algo- 
rithms concern the problems encountered in identify- 
ing components in an HPLC separation based on the 
spectral and chromatographic data available from 
well characterized calibration standards. One such
algorithm concerns the determination of spectral match
factors. Thus, the spectral matching function may be
defined as:

$$MF_s = 1000\,(1 - r^2) \tag{1}$$

where MFs stands for spectral match factor and r is a
correlation coefficient according to:

$$r = \frac{\sum xy - (\sum x)(\sum y)/n_f}{\left[\left\{\sum x^2 - (\sum x)^2/n_f\right\}\left\{\sum y^2 - (\sum y)^2/n_f\right\}\right]^{1/2}} \tag{2}$$

where x and y, respectively, are absorbances taken
from the compared spectra at the same wavelength,
Σ is the summation function, and nf is the number of
selected wavelengths. It will be understood by those
skilled in the art that other spectral matching functions,
such as:

$$MF_s = 1000 \cdot r^2 \tag{3}$$

can be employed in the practice of this invention.

Spectral match factors can range from zero for a
perfect match to 1000 for total absence of correlation.
General match factors, automatch factors, and crossmatch
factors provide examples of spectral match
factors. For example, in determining general match
factors (MFg), r is the correlation coefficient obtained
from the correlation between absorbances of individual
spectra for a first and a second chemical compound.
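
The spectral matching function of equations (1) and (2) can be illustrated with a minimal sketch. The function names and the three five-point spectra are illustrative assumptions, not part of the original disclosure; each spectrum is taken to be a list of absorbances at the same selected wavelengths.

```python
from math import sqrt

def correlation(x, y):
    """Correlation coefficient r of equation (2) for two absorbance lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("spectra must be sampled at the same wavelengths")
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    num = sxy - sx * sy / n
    den = sqrt((sxx - sx * sx / n) * (syy - sy * sy / n))
    return num / den

def match_factor(x, y):
    """Equation (1): 0 for a perfect match, 1000 for no correlation."""
    r = correlation(x, y)
    return 1000.0 * (1.0 - r * r)

# Hypothetical spectra (absorbance at five selected wavelengths).
spectrum_a = [0.10, 0.40, 0.90, 0.50, 0.20]
spectrum_b = [0.20, 0.80, 1.80, 1.00, 0.40]   # same shape, doubled absorbance
spectrum_c = [0.90, 0.50, 0.20, 0.40, 0.10]   # different shape

print(match_factor(spectrum_a, spectrum_b))
print(match_factor(spectrum_a, spectrum_c))
```

Because r is insensitive to scaling, two spectra of the same shape at different concentrations give a match factor near zero. A completely flat spectrum makes the denominator of equation (2) zero, and this sketch does not guard against that case.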

As will be appreciated by those skilled in the art, 
one problem with the general match factor thus de- 
scribed is the lack of a meaningful limiting value for 
the differentiation between a positive and a negative 
match. Accordingly, one embodiment of the present 
invention provides such a limit. 



After multiple copies of spectra are obtained for
a first and second chemical compound, the spectral
match factors for certain selected matches are compared.
For example, the match factors for all matches
of individual spectra for the first compound are compared
with the average spectrum for that compound.
In addition, the match factors for matches of all individual
spectra for the second compound are compared
against the corresponding average spectrum
for that compound. These comparisons of individual
versus average spectra are known in accordance
with this invention as automatching functions and the
match factors so obtained are known as automatch
factors (Ma).

In accordance with one embodiment of the present
invention, crossmatch factors are next obtained
by matching: 1) all individual spectra for the first compound
against the average spectrum for the second
compound; and 2) all individual spectra for the second
compound against the average spectrum for the
first compound. The match factors obtained by comparing
the individual spectra of one compound with
the average spectrum for the other compound are
known as crossmatch factors (Mc).
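
The automatching and crossmatching steps can be sketched as follows. This is only an illustration under stated assumptions: the helper names and the replicate spectra are hypothetical, and the match factor is computed with equation (1).

```python
from math import sqrt

def match_factor(x, y):
    """Equation (1): 1000 * (1 - r^2), with r the correlation of x and y."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    num = sum(a * b for a, b in zip(x, y)) - sx * sy / n
    den = sqrt((sum(a * a for a in x) - sx * sx / n) *
               (sum(b * b for b in y) - sy * sy / n))
    r = num / den
    return 1000.0 * (1.0 - r * r)

def average_spectrum(spectra):
    """Mean absorbance at each wavelength over replicate spectra."""
    return [sum(column) / len(column) for column in zip(*spectra)]

def auto_and_cross(first, second):
    """Return (automatch factors, crossmatch factors) for two compounds."""
    avg1, avg2 = average_spectrum(first), average_spectrum(second)
    # Automatch: each individual spectrum against its own compound's average.
    auto = ([match_factor(s, avg1) for s in first] +
            [match_factor(s, avg2) for s in second])
    # Crossmatch: each individual spectrum against the other compound's average.
    cross = ([match_factor(s, avg2) for s in first] +
             [match_factor(s, avg1) for s in second])
    return auto, cross

# Hypothetical replicate spectra for two different compounds.
first = [[0.10, 0.40, 0.90, 0.50],
         [0.11, 0.41, 0.88, 0.52],
         [0.09, 0.39, 0.91, 0.49]]
second = [[0.30, 0.70, 0.60, 0.20],
          [0.31, 0.69, 0.62, 0.19],
          [0.29, 0.71, 0.59, 0.21]]
auto, cross = auto_and_cross(first, second)
print(auto)
print(cross)
```

For compounds that really differ, the automatch factors cluster near zero while the crossmatch factors are much larger, which is the separation the subsequent statistical test examines.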

The well-known Student's t-test is employed in
analyzing the results from automatching and crossmatching.
Application of the t-test in this invention
yields a difference (D) between the mean values for
the automatch factors and the crossmatch factors.
The t-test also provides a probability that this D-value
is significant, i.e. that the two means are different.
Where these means are different, the first and second
compounds can reliably be said to represent different
species.

In accordance with one embodiment, a match
discrimination function may also be defined as follows:

$$MT_{dis} = \frac{D}{T(DF, prob)} \tag{4}$$

where MTdis is the match discriminator, D is the difference
between the mean match factors derived from the
automatching and crossmatching functions, DF is the
degrees of freedom, which are calculated from the
number of individual spectra for the first and second
compounds, and T(DF,prob) is the t-value required for
a desired degree of probability (prob, in %) that two
means differing by that t-value are different given the
degrees of freedom applicable. It is preferred that the
degree of probability be 99%. As will be appreciated
by those skilled in the art, MTdis depends on a number
of factors, such as the number of spectral data points
employed, the noise present in the individual spectra,
any pretreatment applied to the spectra before
matching and, of course, the degree of similarity between
the two compounds compared. Of course,
where MTdis is equal to one (1) the actual probability
that the first and second compounds are different will
be equal to the desired probability. Where MTdis is
less than one (1), the actual probability will be less
than the desired probability; where MTdis is greater
than one (1), the actual probability will be greater
than the desired probability.
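
A minimal sketch of the match discriminator of equation (4) follows. It rests on two assumptions that the text does not spell out: D is taken as the pooled-variance two-sample t statistic for the difference between the mean crossmatch and mean automatch factors (one reading under which MTdis equal to one corresponds to the desired probability), and the critical value T(DF,prob) is looked up in a standard t-table and passed in by the caller. The function name and the numerical data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def match_discriminator(auto, cross, t_crit):
    """Return (MTdis, DF) for lists of automatch and crossmatch factors.

    ASSUMPTION: D is the pooled-variance t statistic for the difference
    between the two means; t_crit is T(DF, prob) from a t-table for
    DF = len(auto) + len(cross) - 2.
    """
    n1, n2 = len(auto), len(cross)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(auto) +
                  (n2 - 1) * variance(cross)) / df
    std_err = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    d = (mean(cross) - mean(auto)) / std_err
    return d / t_crit, df

# Hypothetical match factors from six automatches and six crossmatches.
auto = [1.2, 0.8, 1.5, 1.0, 0.9, 1.1]
cross = [40.0, 42.5, 38.0, 41.0, 39.5, 43.0]
T_99_DF10 = 3.169   # two-sided 99% critical t-value for 10 degrees of freedom

mt_dis, df = match_discriminator(auto, cross, T_99_DF10)
print(df, mt_dis)
```

A value of MTdis above one then indicates that the two means differ with more than the desired probability. Identical values within both groups would give a zero standard error, which this sketch does not handle.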

In accordance with this invention, it is further intended
that a fixed MTdis be derived for a given standard.
Such derivation will permit testing for the significance
of an individual match between a standard and
an unknown spectrum without the need for a complete
statistical analysis.

Where a given spectral match factor is not equal
to zero, another value indicative of the quality of that
match factor can be obtained by analysis of the residual
resulting from the correlation between two spectra.
It will be appreciated by those skilled in the art
that if a "best-fit" regression line is calculated for the
correlation between any two spectra X and Y, such
that one attempts to predict absorbance values for
spectrum Y from the correlated value of spectrum X
for each wavelength recorded, then the residual at
each wavelength is a positive or negative difference
between the actual absorbance of spectrum Y and
the value of spectrum Y as predicted from the correlation
with spectrum X. When studied as a function of
increasing wavelength, residuals tend to fluctuate
above or below zero (0).

If the two spectra differ in a systematic fashion,
the residuals will tend to migrate across the regression
line only slowly. If, on the other hand, the residuals
are distributed around the regression line in a random
fashion, that same match factor might still indicate
spectral match, obscured only by noise. Thus, in
accordance with one embodiment of this invention, a
crossover number (CN) is defined as follows:

$$CN = \frac{C}{N - 1} \tag{5}$$

where C is the number of times the residuals change
sign when sorted by increasing wavelength and N is
the number of spectral data points used for the match.
It will be understood that the maximum value for CN
is one (1) and that CN can never quite reach zero (0).
Higher values for CN will indicate a likelihood that the
deviation from zero (0) for a given spectral match factor
is due to random noise and not to systematic differences
in the spectra compared. It will also be appreciated
by those skilled in the art that the crossover
numbers described can also be derived if spectra X
and Y are exchanged. In this manner, one might obtain
slightly different values which nonetheless exhibit
the same characteristics.
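
The crossover number of equation (5) can be sketched as below. The function name and the seven-point spectra are hypothetical; the regression is an ordinary least-squares line predicting spectrum Y from spectrum X, and residuals that are exactly zero are skipped when counting sign changes, which is an assumption the text does not address.

```python
def crossover_number(x, y):
    """CN = C / (N - 1) for the regression of spectrum y on spectrum x.

    x and y are absorbance lists ordered by increasing wavelength.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    # Count sign changes in wavelength order, ignoring exact zeros.
    signs = [r > 0 for r in residuals if r != 0.0]
    changes = sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    return changes / (n - 1)

# Hypothetical spectra: y_noise differs from x only by alternating noise,
# y_shape differs systematically (absorbances squared).
x = [0.1, 0.3, 0.6, 0.9, 0.6, 0.3, 0.1]
y_noise = [0.11, 0.29, 0.61, 0.89, 0.61, 0.29, 0.11]
y_shape = [a * a for a in x]

print(crossover_number(x, y_noise))
print(crossover_number(x, y_shape))
```

The noisy copy yields the higher crossover number, in line with the statement that higher values point to random noise rather than systematic spectral differences.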

Since the correlation procedure employs absorbance
values at identical wavelengths, the comparison
of spectra having an error in wavelength can lead
to erroneous match factors. Thus, it is particularly
preferred in determining spectral match factors that
the wavelength assignment for the two spectra compared
be accurate. One means for providing accurate
wavelength assignments is by acquiring spectra for
the same standard under conditions (such as mobile
phase, column, hardware calibration, and instrument)
identical to those employed in obtaining the two
spectra in question. Such acquisition might be achieved
by use of an internal standard.

Standard spectra thus acquired can then be used
to calibrate other, related spectra. As will be appreciated
by those skilled in the art, the acquired standard
spectra can be used to experimentally determine the
difference in wavelength assignment by analyzing
the spectral match factors for the two standard spectra
as a function of a fractional wavelength shift to the
left or right of one spectrum against the other. As
shown in Figure 1, the maximum match factor should
be obtained at a wavelength shift necessary to correct
for any wavelength inaccuracy between the two
unknown spectra. While each UV absorbance can be
utilized at its nominal, absolute value, correlation can
optionally be performed in accordance with one embodiment
of this invention by inversely weighting
each absorbance value by the variance known to be
associated with the wavelength at which it was obtained.
Such procedure could improve the reproducibility
of the matching process by weighting less heavily
those regions of the spectrum known to be unreliable.
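
The determination of a wavelength shift from two standard spectra can be sketched as follows. Several details are assumptions made only for illustration: the shift is expressed in units of the sampling interval, fractional shifts are obtained by linear interpolation, the match factor is taken in the form of equation (3) so that the best alignment gives the maximum value, and the Gaussian-shaped test spectra and all function names are hypothetical.

```python
from math import exp, floor

def r_squared(x, y):
    """Squared correlation coefficient between two equal-length lists."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    num = sum(a * b for a, b in zip(x, y)) - sx * sy / n
    den = ((sum(a * a for a in x) - sx * sx / n) *
           (sum(b * b for b in y) - sy * sy / n))
    return num * num / den

def shifted(spectrum, shift):
    """Linearly interpolate the spectrum at index positions i + shift.

    Positions outside the recorded range are returned as None.
    """
    last = len(spectrum) - 1
    out = []
    for i in range(len(spectrum)):
        pos = i + shift
        if pos < 0 or pos > last:
            out.append(None)
            continue
        lo = min(floor(pos), last - 1)
        frac = pos - lo
        out.append(spectrum[lo] * (1 - frac) + spectrum[lo + 1] * frac)
    return out

def best_shift(reference, measured, max_steps=20, step=0.1):
    """Scan shifts from -max_steps*step to +max_steps*step.

    Returns (shift, match factor) for the shift with the highest factor.
    """
    best = None
    for k in range(-max_steps, max_steps + 1):
        shift = k * step
        pairs = [(a, b) for a, b in zip(reference, shifted(measured, shift))
                 if b is not None]
        score = 1000.0 * r_squared([a for a, _ in pairs],
                                   [b for _, b in pairs])
        if best is None or score > best[1]:
            best = (shift, score)
    return best

# Synthetic band sampled at 21 points; the second copy is displaced by
# exactly one sampling interval.
reference = [exp(-((i - 10) / 3.0) ** 2) for i in range(21)]
measured = [exp(-((i - 11) / 3.0) ** 2) for i in range(21)]
shift, score = best_shift(reference, measured)
print(shift, score)
```

The shift recovered this way would then be applied to related spectra acquired under the same conditions before their match factors are computed.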

It will be appreciated by those skilled in the art
that chemical compounds can be distinguished for 
certain purposes by employing general match factors, 
automatch factors, and crossmatch factors individual- 
ly or in conjunction with one another. For example, 
general match factor alone will sometimes be suffi- 
ciently indicative of the degree of similarity between 
two chemical compounds. In other cases, general 
match factor alone will be inconclusive and it may 
prove necessary to consider either automatch factor 
or crossmatch factor, or both, to effectively distin- 
guish chemical compounds. 

In certain embodiments, the present invention 
also provides a method for analyzing chromatograph- 
ic data, along with UV spectral data, to determine on 
a peak-by-peak basis the best match for a given standard
in an unknown sample. In this regard, the parameters
retention time deviation (RTdev), peak area
deviation (ARdev), peak height deviation (HTdev), and
area and height deviation (AHdev) are defined by the
following functions:

$$RT_{dev} = \frac{|RT_1 - RT_2|}{RT_{lim}} \tag{6}$$

$$AR_{dev} = \frac{|AR_1 - AR_2|}{AR_{lim}} \tag{7}$$

$$HT_{dev} = \frac{|HT_1 - HT_2|}{HT_{lim}} \tag{8}$$

$$AH_{dev} = \frac{AR_{dev} + HT_{dev}}{2} \tag{9}$$

where the subscripts 1 and 2, respectively, denote expected
and actual data or data corresponding to any
two chemical compounds, and lim indicates an experimentally
or otherwise defined limit of variability for
the indicated quantities.
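
Equations (6) through (9) translate directly into a short sketch. The dictionary keys, the function names, and the numerical values for the standard peak, the unknown peak, and the limits of variability are hypothetical.

```python
def deviation(expected, actual, limit):
    """Absolute difference scaled by the limit of variability."""
    return abs(expected - actual) / limit

def peak_deviations(standard, unknown, limits):
    """Return RTdev, ARdev, HTdev and AHdev for one pair of peaks."""
    rt_dev = deviation(standard["rt"], unknown["rt"], limits["rt"])
    ar_dev = deviation(standard["area"], unknown["area"], limits["area"])
    ht_dev = deviation(standard["height"], unknown["height"], limits["height"])
    return {
        "rt_dev": rt_dev,
        "ar_dev": ar_dev,
        "ht_dev": ht_dev,
        "ah_dev": (ar_dev + ht_dev) / 2,   # equation (9)
    }

# Hypothetical peak data: retention time in minutes, area and height in
# arbitrary detector units.
standard = {"rt": 12.40, "area": 1500.0, "height": 210.0}
unknown = {"rt": 12.46, "area": 1560.0, "height": 201.0}
limits = {"rt": 0.10, "area": 100.0, "height": 15.0}

devs = peak_deviations(standard, unknown, limits)
print(devs)
```

Scaling each difference by its limit puts the three quantities on a common footing, so that a value of one means the difference has reached the accepted limit of variability.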

Thus, the provided peak assignment algorithm
uses a hierarchical procedure which employs the various
parameters to select peaks corresponding to
chemical compounds which are to be paired and further
analyzed. In accordance with certain embodiments,
all unknown candidate peaks for each standard
inside an optional retention time window are
ranked by increasing match discriminator. If the candidate
peak with the lowest match discriminator and
the one with the next highest match discriminator differ
by more than one (1), the one with the lowest
match discriminator is considered a positive identification.
If the difference is less than one (1), retention
time deviation is considered next such that the peak
with the lowest retention time deviation is considered
a positive match if the next highest retention time
deviation differs by more than one (1). If analysis of
retention time deviation does not provide a statistically
significant result, the area and height deviation is
analyzed in a similar fashion. If, at this point, a positive
identification has not been reached, CN is considered
such that the candidate with the highest CN
is selected as a match.
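
The hierarchical selection just described can be sketched as a short function. The field names, the candidate data, and the choice to return the deciding criterion alongside the peak are illustrative assumptions; the threshold of one (1) and the order of the criteria follow the text.

```python
def select_candidate(candidates, threshold=1.0):
    """Choose one candidate peak for a standard.

    Each candidate is a dict with the hypothetical keys 'mt_dis',
    'rt_dev', 'ah_dev' and 'cn'. Returns (candidate, criterion used).
    """
    if not candidates:
        return None, None
    if len(candidates) == 1:
        return candidates[0], "only candidate"
    # Match discriminator first, then retention time deviation, then
    # area and height deviation: lowest value wins if it leads by more
    # than the threshold.
    for key in ("mt_dis", "rt_dev", "ah_dev"):
        ranked = sorted(candidates, key=lambda c: c[key])
        if ranked[1][key] - ranked[0][key] > threshold:
            return ranked[0], key
    # Last resort: the candidate with the highest crossover number.
    return max(candidates, key=lambda c: c["cn"]), "cn"

# Hypothetical candidates inside the retention time window of one standard.
candidates = [
    {"name": "peak A", "mt_dis": 0.4, "rt_dev": 0.3, "ah_dev": 0.5, "cn": 0.6},
    {"name": "peak B", "mt_dis": 0.9, "rt_dev": 1.8, "ah_dev": 0.4, "cn": 0.7},
]
chosen, criterion = select_candidate(candidates)
print(chosen["name"], criterion)
```

In this example the match discriminators differ by less than one, so the decision falls to retention time deviation, where the two candidates differ by more than one.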

It will be understood that peak assignment between
standards and unknowns has to be bidirectionally
unambiguous; that is, each standard can only
be matched by one unknown and vice versa. Thus, in
cases where two different standards are matched by
the same unknown peak, the priority of standards is
established in accordance with this invention on the
basis of the same rules used to determine the best
unknown matched candidate.

After successful peak assignment, there will be
a defined, unambiguous relationship between the
peaks in the standard and the unknown such that at
most one and possibly no peak is assigned for the
unknown to each peak from the standard. Consequently,
two possibilities exist for peak assignment: (1) all
peaks in a standard have one peak for the unknown
assigned to them and the unknown contains zero or
more extra peaks that do not correspond to any standard;
or (2) not all peaks in a standard have been
assigned unknown peaks and the unknown contains
zero or more extra peaks that do not correspond to
any standard.

In accordance with certain embodiments, a peak
score (PS) is next calculated for all pairs of successfully
assigned peaks as follows:

$$PS = \frac{(f_m \cdot MT_{dis}) + (f_r \cdot RT_{dev}) + (f_a \cdot AH_{dev})}{NF} \tag{10}$$

where fm, fr, and fa are variable weighting factors for
match discriminator, retention time deviation, and
area and height deviation, and NF is an empirically
derived normalization factor, typically three (3), equal
to the number of parameters employed.
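
Equation (10) can be written as a one-line function. The parameter names and the default weights of 1.0 are assumptions chosen for illustration; the default normalization factor of three follows the text.

```python
def peak_score(mt_dis, rt_dev, ah_dev, f_m=1.0, f_r=1.0, f_a=1.0, nf=3.0):
    """Equation (10): weighted sum of the three parameters divided by NF."""
    return (f_m * mt_dis + f_r * rt_dev + f_a * ah_dev) / nf

# Hypothetical values for one assigned pair of peaks.
equal_weights = peak_score(0.3, 0.6, 0.6)
# Setting f_r to zero ignores retention time, with NF reduced to two.
no_retention = peak_score(0.3, 0.6, 0.6, f_r=0.0, nf=2.0)
print(equal_weights, no_retention)
```

The second call shows how a weighting factor can switch a parameter off when it is not meaningful for a given analysis, with NF adjusted to the number of parameters actually employed.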

As a further indication of confidence in a given
peak match, the difference in peak score between the
candidate peak and the next best match can be used.
It is also possible to reverse the order in which retention
time deviation, area and height deviation and
crossover number are used to resolve ambiguous
matches, or to not include any or all of these values in the
comparison. For example, if it is known that the response
can vary from sample to sample, it might
make sense not to use response matching. If, on the
other hand, the same sample is analyzed using different
chromatographic conditions, retention time deviation
might be meaningless and area and height deviation
could be used in peak tracking. It will be appreciated
that such considerations will depend intimately
upon each particular analysis and the facts associated
therewith.

In one embodiment of the present invention, a
modification of the algorithms accounts for the
possibility that a chromatographic peak in the unknown
might actually contain more than one component. In such
an embodiment, each candidate peak is checked for the
presence of all the standards occurring in the
pre-selected retention time window using multicomponent
analysis. All but one of the standards are then
subtracted from the unknown spectrum at the
concentration determined and the resulting corrected
spectrum is matched against the remaining standards as
previously discussed.

Once peak score has been defined, a sample score (SS)
can be defined as follows:

$$SS = \frac{\sum PS + (p_1 \cdot EP) + (p_2 \cdot MP)}{N} \qquad (11)$$

where the individual peak scores are summed over all
standard peaks successfully matched from the unknown,
EP are extra peaks not present in the standard and are
weighted by factor $p_1$, missing peaks (MP) are
weighted by a penalty score $p_2$, and N is the total
number of standard peaks expected. It will be
appreciated by those skilled in the art that sample
scores for well characterized reference materials can
be analyzed to arrive at reasonable confidence limits
for sample score. Scores for unknown samples can then
be compared and their similarity to the standard can be
indicated by the difference in sample scores.

While the principles of the present invention are
described as they apply to chromatograms produced by
HPLC, it is intended that the theories and methods
described herein are equally applicable to
chromatograms produced by other well known methods,
such as gas chromatography and liquid techniques other
than HPLC, such as capillary zone electrophoresis.

It is also intended that spectral data amenable to the
practice of this invention may be derived from
ultraviolet, visible, fluorescence, infrared, Raman,
atomic absorption, nuclear magnetic resonance, and mass
spectroscopic devices. It is preferred that any such
spectroscopic device provide electromagnetic radiation
having reproducible wavelength. It is particularly
preferred that UV instruments be employed, due to both
the generally high reproducibility of UV radiation and
the consistent manner in which absorbance at one UV
wavelength relates to absorbances at neighboring
wavelengths. This is to be contrasted with discrete
banded spectra encountered, for example, in nuclear
magnetic resonance spectroscopy.

Additional objects, advantages, and novel features of
this invention will become apparent to those skilled in
the art upon examination of the following examples
thereof concerning the identification of peptide
fragments from a tryptic digest of
recombinant-DNA-derived human growth hormone (r-hGH).

Preparation of Tryptic Digest of r-hGH 

Samples of r-hGH were oxidized by adding 50 µl of
chilled performic acid (nine parts 88% formic acid and
one part 30% hydrogen peroxide) to 1.0 mg r-hGH and
reacting the mixture for one hour at 0°C.

Samples were digested in a buffer solution containing
100 mM sodium acetate, 10 mM Tris base and 1 mM calcium
chloride at pH 8.3 at 37°C by addition of 1:100 trypsin
(trypsin:r-hGH, by weight) at times zero and at two
hours. Samples were acidified after a total of four
hours with 100 µl of phosphoric acid (pH less than 3)
per milliliter of sample and analyzed directly or
stored for up to three days at 2-8°C. The digestion of
r-hGH was complete after four hours.

Separation by HPLC 

HPLC separations were performed using a Hewlett-Packard
1090M HPLC system equipped with a DR5 ternary pumping
system, an automated injection and sampling system, a
heated column compartment and a diode-array detector,
and controlled by an HP 79994A ChemStation.

Two gradient systems were employed for the separation
of the tryptic fragments. System I used trifluoroacetic
acid (TFA) in water at 0.1% as solvent A, with 0.8% TFA
in acetonitrile as solvent B. The gradient was linear
from 0 to 60% B between 0 and 120 minutes at a
flow-rate of 1 ml/min with the oven temperature set at
40°C. System II utilized 50 mM sodium phosphate in
water, pH 2.85, as solvent A; solvent B was
acetonitrile. The gradient profile was linear from 0 to
40% B over 120 minutes at a flow-rate of 1 ml/min with
the oven temperature set to 40°C. For both gradient
systems a 15 cm x 0.46 cm Nucleosil C18 reversed phase
column was used with particle size 5 µm, pore size
100 Å, packed by Alltech Associates. Figure 2 shows a
typical chromatogram of a mixture of tryptic peptides
derived from an r-hGH reference standard analyzed with
the TFA gradient system.

Data Processing 

For all analyses, spectra were acquired at one-second
intervals over the range from 200 to 350 nm. In
addition, chromatographic signals were recorded at 220,
230, 254, 274, 280, and 292 nm with a reference
wavelength of 350 nm in all cases. Raw data were stored
on magnetic media and were processed on the ChemStation
using the built-in spectral library functions as well
as additional evaluation software that was written for
that purpose using a high-level command language
available on the ChemStation.

Spectral Matching 

Numerical point by point comparison of the two UV
spectra was implemented on ChemStation with the COMPARE
command described in A. Drouen, The Compare Command,
Information Note, Publication Number 12-5952-3725,
Hewlett-Packard GmbH, Waldbronn, FRG (1987). This
comparison is illustrated in Figure 4 where spectra for
peptides T13 and T14 are compared. At each wavelength,
absorbance values for the two peptide spectra are
plotted as abscissa and ordinate and a linear
regression is applied to the resulting scatter plot as
shown in Figure 4b. The square of the correlation
coefficient, multiplied by 1000, is defined as the
match factor for the two spectra. Those skilled in the
art will appreciate that the two peptides shown in
Figure 4a differ in the nature of the aromatic amino
acid residue, which is phenylalanine for T13 and
tyrosine for T14. Their spectra are clearly different
even on visual comparison and the match factor
accordingly has a low value of 919.

Figure 3 illustrates how the match factor is affected
when T13 is compared with T12, a peptide fragment which
does not contain any aromatic amino acid at all. The
corresponding spectra are very similar and the match
factor increases to 997 (Figure 3b), approaching the
value expected for identical spectra.
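
The match factor defined above can be illustrated with a short sketch. This is not the COMPARE command itself; it is a minimal Python rendering of the stated definition (1000 times the squared correlation coefficient), and it assumes both spectra are sampled at the same wavelengths. The function name `match_factor` is hypothetical.

```python
def match_factor(spectrum_a, spectrum_b):
    """Return 1000 * r^2 for two spectra sampled at identical wavelengths."""
    if len(spectrum_a) != len(spectrum_b) or len(spectrum_a) < 2:
        raise ValueError("spectra must have the same length (at least 2 points)")
    n = len(spectrum_a)
    mean_a = sum(spectrum_a) / n
    mean_b = sum(spectrum_b) / n
    # Sums of squares and cross products about the means.
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in zip(spectrum_a, spectrum_b))
    s_aa = sum((a - mean_a) ** 2 for a in spectrum_a)
    s_bb = sum((b - mean_b) ** 2 for b in spectrum_b)
    if s_aa == 0 or s_bb == 0:
        raise ValueError("a flat spectrum has no defined correlation")
    # r^2 = s_ab^2 / (s_aa * s_bb); scaling by 1000 gives the match factor.
    return 1000.0 * s_ab ** 2 / (s_aa * s_bb)
```

Because the correlation coefficient is insensitive to a constant scale factor, two spectra that differ only in concentration give a match factor of 1000 under this definition.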

Compilation of Spectral Calibration Library 

A library of standard spectra for the various fragments
in the tryptic map of r-hGH was next compiled. For this
purpose, a reference standard was injected four times
and analyzed with gradient systems I (TFA based) and II
(phosphate based). Each of the resulting data files was
then processed.

After integration of the signal at 220 nm, apex spectra
were identified for all integrated peaks. They were
corrected for solvent background by subtracting a
reference spectrum which was interpolated from two
baseline spectra at either side of the peak. The
resulting peak spectra were then stored into a library
file which was referred to as a sample library since it
contained all spectra characteristic of a given sample.

The two-point reference correction employed was
especially important in the case of gradient I since
TFA undergoes a significant change in spectral
properties as the acetonitrile concentration is
increased during the course of the gradient elution.
Figure 5 illustrates how the uncorrected upslope,
downslope, and apex spectra for fragment T9 differ
significantly from the standard T9 spectrum. After
baseline correction, all three spectra matched the
standard spectrum closely, as shown in Figure 5b.

Next, a retention time window of ±0.5 minutes centered
on the apex of each peak from the first standard was
employed to find the spectrum with the best match from
each of the other three standards. Those spectra that
were common to all four standards were then averaged,
normalized, smoothed, and transferred into a new
spectral library file which was named the calibration
library. For each peak in the tryptic map, this library
file contains the UV spectrum and values for area,
height, retention time, and scaling factor; all values
were based on averages from the four standard runs.

As discussed in W.S. Hancock, et al., Cold Spring
Harbor Symposium, (1988) p.95, the identity of the
tryptic fragments had been determined by amino acid
analysis and fast atom bombardment mass spectrometry.
Library entries for peaks eluting prior to the first
and after the last tryptic fragment, as well as entries
for peaks with area or height below 1% of total area or
height, were then removed. As had been shown in the
Hancock reference, most of the minor peaks were not
related to r-hGH but were nonspecific background,
presumably derived from trypsin or due to other
interferences, such as baseline noise or solvent
impurities.

The final calibration library for the TFA system
contained 40 entries, 19 of which represented tryptic
fragments of known identity. The phosphate library in
its final form consisted of 31 entries. These two
calibration libraries were used in all subsequent
experiments.

It should be noted that correlation of data from
different standard runs relies heavily on good
chromatographic reproducibility. In Figure 6,
chromatographic traces from four replicates analyzed
with gradient II are overlaid to demonstrate excellent
instrument performance even towards the end of the
gradient. Statistical analysis of retention time
variations showed the average standard deviation for
all peaks incorporated into the calibration library to
be 0.027 min (1.6 s) and 0.021 min (1.3 s) for gradient
systems I and II, respectively.

Determination of Reproducibility and Selectivity of 
the Calibration Library 

Since two key properties of the match factor that
determine the usefulness of the spectral data
incorporated into the calibration library are
sensitivity and selectivity, it was decided to
investigate these properties in a systematic fashion in
order to obtain some quantitative guidelines. Results
were obtained using gradient I since TFA, when employed
as modifier, presents a greater challenge for a liquid
chromatograph detector and pump than does phosphate.

Reproducibility of the match factor determines the
absolute limit for the similarity between any two
spectra and thus defines the sensitivity of spectral
matching. Two spectra can be considered different only
when the mean and standard deviation for the match
between the two differ significantly from those
obtained by repeatedly matching identical spectra. It
is not sufficient to use a match factor cutoff as the
criterion for a positive identification. Additional
statistical information is needed to determine the
significance of a given match factor.

Spectra for T13 or T14 derived from eleven different
injections were averaged to obtain a representative
spectrum for each peptide. All individual spectra were
then matched against their respective average
(automatching, as shown in Figure 4c) and the resulting
distribution of match factors was compared with that
obtained from matching individual T13 spectra against
the average T14 spectrum and vice versa (crossmatching,
as shown in Figure 4c). It can be seen that the means
for automatch factor and crossmatch factor are quite
different; the average value for the crossmatch factor
of 918.6 is certainly a good indication of
dissimilarity. More importantly, confidence intervals
of three standard deviations above and below each mean
as indicated in Figure 4c do not overlap, but show a
significant gap. Thus, T13 can be distinguished from
T14 with a great degree of confidence.

Figure 3c shows the corresponding plot of automatch
factors and crossmatch factors for T13 and T12. These
peptides are very similar in their spectral
characteristics as can be seen by the mean crossmatch
factor score of 997.25. Nonetheless, there is still a
clear gap between the confidence intervals for
automatch factor and crossmatch factor, indicating that
it is possible to differentiate between compounds of
extreme similarity. In statistical terms, if Student's
t-test is applied to the data in Figure 3c, a t-value
of 57 is obtained along with a probability of better
than 99.99% that the mean values obtained for automatch
factor and crossmatch factor are indeed different.

The t-test for the comparison for T13 and T14 (Figure
4c) results in a t-value of 542 and a probability of
100.00% that the spectra are different. t-Values
representing the similarity among the four aliphatic
peptides (T7, T8, T11, and T12) ranged from 13 to 133,
which is sufficient for statistically valid
distinction. It will be appreciated by those skilled in
the art that for a population size of 11, a t-value of
at least 6.2 is required to provide greater than 99.99%
probability that two means are different.
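
The comparison of automatch and crossmatch distributions can be sketched as a two-sample t statistic. The text does not state which variant of Student's t-test was used, so the pooled-variance form below is an assumption, and the function name `student_t` is hypothetical. It takes two lists of match factors and returns the absolute t-value.

```python
import math
import statistics


def student_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic (assumed variant)."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # Sample variances (n - 1 in the denominator).
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    # Pool the variances, weighting each by its degrees of freedom.
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    # Raises ZeroDivisionError if both samples have zero variance.
    return abs(mean_a - mean_b) / math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
```

A larger t-value means the two means are separated by more multiples of their combined standard error, which is the sense in which the values quoted above indicate distinguishable spectra.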

When the reproducibility of match factors for the four
standard runs using gradient I was analyzed, it was
found that the match factor ranged from 998.76 to
1000.00, with standard deviations from less than 0.001
to 1.306. This indicated that very stringent match
criteria could be employed for spectral identity. Since
variability of the match factor increases as peak
concentrations decrease and since the relative
concentrations of the tryptic fragments from r-hGH
should be fairly constant, it was decided to define
individual match criteria for each entry in the
calibration library rather than use a fixed match
threshold. To be considered a positive match, an
unknown spectrum had to have a match score above a
threshold of three standard deviations below the mean
match for a given standard. This provided a 99.8%
probability that only correct matches were assigned.
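
The per-entry match criterion can be sketched as follows. This is a minimal illustration, assuming the replicate match factors for one library entry are available as a list; the function names `match_threshold` and `is_positive_match` are hypothetical.

```python
import statistics


def match_threshold(replicate_match_factors, n_sigma=3.0):
    """Threshold n_sigma sample standard deviations below the mean match."""
    mean = statistics.fmean(replicate_match_factors)
    sd = statistics.stdev(replicate_match_factors)
    return mean - n_sigma * sd


def is_positive_match(match_factor, replicate_match_factors):
    """True if the unknown's match factor lies above the entry's threshold."""
    return match_factor > match_threshold(replicate_match_factors)
```

With hypothetical replicate values of 999.0, 999.5 and 1000.0, the threshold is 998.0, so an unknown scoring 998.5 would be accepted and one scoring 997.9 would be rejected.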

To establish selectivity of the calibration library,
each standard in the calibration library was matched
against every entry from a typical sample library to
determine the number of potential mismatches. A
mismatch in this context was defined as a standard
entry for which more than one match candidate was found
with a match factor inside the confidence limits
previously established. According to certain
embodiments, selectivity can be greatly enhanced by
defining a retention time window around a given
standard to limit the number of search candidates. For
example, when a retention time window of ±1 min was
employed, incorrect matches were found for only three
standards. These mismatches were all minor peaks with
peak heights between 3 and 6 milli absorbance units
(mAU) and did not correspond to any known tryptic
fragments of r-hGH. With a ±0.5 min window, no
mismatches were found. It was thus concluded that with
the selection of an appropriate retention time window,
the calibration library for r-hGH provides accurate
identification of all fragments.

Traditional calibration procedures for peak
identification, such as those implemented in the
standard ChemStation software and similar in nature to
other commercially available software for
chromatographic data handling, where peak recognition
is based only on retention times, resulted in
mismatches for 5-8 standards inside a ±0.5 min
retention time window. When the window was increased to
±1 min, nearly all standards exhibited mismatched
peaks.

Definition and Application of the Peak Score 

It will be appreciated by those skilled in the art that
since chromatographic conditions are not always stable,
resolution between adjacent peaks may change or
additional peaks may appear in a tryptic map. Such
instability will make positive identification of an
unknown peak difficult, even when spectral matching is
employed. However, in addition to peak spectra, other
quantitative information is available for each peak and
can be utilized in accordance with certain embodiments
of the present invention to develop a procedure that
will assign a numerical similarity score to each match
between a standard and an unknown peak. Figure 7 shows
the variability of the different parameters available
to construct this score. Based on the relative standard
deviations, it is obvious that the greatest confidence
can be placed in the match factor. It can be seen that
retention time information and peak area and height
exhibit deviations larger than those for the match
factor by one and two orders of magnitude,
respectively.

Based on the statistical information in Figure 7, the
peak score can be empirically derived as follows:

$$PS = \frac{10 \cdot MT_{dis} + RT_{dev} + 0.1\,(AR_{dev} + HT_{dev})}{11.2} \qquad (12)$$

where, to avoid unrealistically high delta values, the
following minimum values were established: 0.1 for
$MT_{dis}$, 0.05 min for $RT_{dev}$, and 1% for
$AR_{dev}$ and $HT_{dev}$. In this manner, equation
(12) accounts for the fact that the spectral match is
the most significant parameter for peak recognition and
therefore is weighted most heavily. Even if all other
parameters indicate a perfect match, a large deviation
in the match factor indicates that the peak in question
has the wrong identity. The scaling factor of 11.2 is
the sum of all weighting factors and normalizes the
peak score to unit weight.
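
The weighting in equation (12) can be sketched as below. Two assumptions are made: the area and height deviations are read as a sum, which is consistent with the stated weight total of 11.2 (10 + 1 + 0.1 + 0.1), and the four inputs are taken to be deviations already computed and floored by the caller as defined elsewhere in this description. The function name `peak_score` is hypothetical.

```python
def peak_score(mt_dis, rt_dev, ar_dev, ht_dev):
    """Weighted peak score per equation (12); inputs are precomputed deviations."""
    # Weights for match discriminator, retention time, area, and height.
    weights = (10.0, 1.0, 0.1, 0.1)
    terms = (mt_dis, rt_dev, ar_dev, ht_dev)
    # Dividing by the sum of the weights (11.2) normalizes to unit weight.
    return sum(w * t for w, t in zip(weights, terms)) / sum(weights)
```

With all four deviations equal to one the score is one, and a unit deviation in the match discriminator alone already contributes about 0.89, which reflects how heavily the spectral match dominates the score.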

By definition, a perfect peak score would be zero; a
score of one will provide a 99.8% probability that
positive matches will not be missed, but usually
indicates rather marginal similarity between standard
and unknown. Peak scores for all entries in the four
sample libraries used to construct the calibration
library ranged from 0.002 to 0.465 with an average
score of 0.051. Because the score is open ended, it was
somewhat arbitrarily decided that a score of two or
larger indicated a totally mismatched peak. It will be
appreciated by those skilled in the art that the
probability that a positive match will result in a
score of 2 is less than 0.000002%.

Automated Evaluation of Digests Using a Sample 
Score 



Knowing how well a peak from a calibration library is
matched by any given peak in an unknown sample, the
next step is to develop a scoring procedure which
describes the overall similarity between all of the
peaks in the unknown and in a calibration sample. The
sample score as previously defined allows for the
accounting of missed calibration peaks as well as for
supernumerary peaks found in a sample. Furthermore, the
score is normalized so as to be independent of the
number of entries in the calibration library.
Normalization becomes a concern if the library is
modified. Since peak scores larger than 2 have been
defined as mismatches, all peak scores are truncated to
2 so that missed and mismatched peaks have the same
peak score. The penalty score of 1 for extra peaks is
strictly empirical at this point; another possible
approach would be to have the penalty reflect the size
of the extra peak.
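
The sample score with truncation and penalties can be sketched as follows. The penalty of 1 per extra peak and the cap of 2 come from the text; the penalty of 2 per missing peak is an assumption, read from the statement that missed and mismatched peaks have the same score. The function name `sample_score` and its parameter names are hypothetical.

```python
def sample_score(peak_scores, extra_peaks, missing_peaks, expected_peaks,
                 extra_penalty=1.0, missing_penalty=2.0, cap=2.0):
    """Sample score: capped peak scores plus penalties, divided by N."""
    if expected_peaks <= 0:
        raise ValueError("expected_peaks must be positive")
    # Truncate each matched peak score so a mismatch never exceeds the cap.
    matched = sum(min(score, cap) for score in peak_scores)
    # Extra and missing peaks are counted and weighted by their penalties.
    penalties = extra_penalty * extra_peaks + missing_penalty * missing_peaks
    return (matched + penalties) / expected_peaks
```

For a hypothetical sample with peak scores 0.1 and 5.0 (truncated to 2), one extra peak, one missing peak, and three expected standard peaks, the result is (0.1 + 2 + 1 + 2) / 3 = 1.7.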

While a perfect sample score is easily defined as being
exactly zero, a determination must be made concerning a
criterion for what constitutes the limit between a
passing and a failing score. Meaningful limits will
have to be established through statistical analysis of
typical sample scores for reference standards to
account for variability due to different lots of growth
hormone and trypsin, as well as overall chromatographic
variability.

Figure 8a provides the sample scores for the four
sample libraries (1A-D) used to construct the
calibration library as well as for additional samples
(2A-C and 3A-D) derived from the same reference
standard but injected in different amounts. As
expected, the calibration samples themselves (1A-D),
injected at 100 µg, show a very good score of 0.076 or
less, with an average value of 0.050, indicative of the
extreme similarity between all four replicates.

The increase in sample score for the 50 µg injections
(2A-C) to an average value of 0.798 is partly due to a
drift in chromatographic conditions resulting in
resolution changes for several peaks. The co-eluting
fragments T14a and T14c were separated into two peaks,
each with a spectrum different from the composite
spectrum contained in the calibration library. The
partially resolved peak pair T11 and T10c2 (Figure 2)
was not separated at all and, consequently, neither
fragment was identified. Furthermore, the fragment with
the lowest concentration (T19) was not detected at this
smaller sample size.

The 200 µg injections (3A-D) show an average score of
0.443, and thus fall between the 100 and the 50 µg
samples. The increased sample score results from the
same problematic peaks encountered with the 50 µg
injection. In both the 50 and the 200 µg injection, the
additional standard peaks which were missing were all
small peaks of unknown identity. This indicated that
the significance of these unidentified peaks with
respect to sample identity needed to be investigated in
some more detail.

For the phosphate gradient system (gradient II) similar
data are shown in Figure 8b. Again, the four
calibration samples (1A-D) exhibit very low scores of
0.064 and less, with the average at 0.036. An
additional sample (2), which also contains reference
material but which was analyzed at a different time,
shows a higher score of 0.671. This score is in the
range of scores obtained for the 50 and 200 µg
injections of reference material with gradient I.
Closer inspection revealed that here, too, changes in
peak resolution had an adverse effect on the sample
score.

In order to provide data on the kind of sample score
obtained with a sample known to differ from the
standard, samples of r-hGH which was oxidized prior to
digestion with trypsin were analyzed to simulate
potential degradation pathways. As can be seen quite
clearly in Figure 8b, 3A-D, the average sample score of
1.692 lies significantly above the scores obtained for
reference material and reflects the difference between
oxidized and native r-hGH. Furthermore, reproducibility
for the four samples is very good, indicative of the
similarity among replicate injections of the oxidized
samples.

To relate this abstract score to the more tradition- 
al visual method of evaluation, Figure 9 shows a chro- 
matogram for the oxidized r-hGH digest. Peaks that 
disappeared due to oxidation and those peaks that 
appear as new fragments and are not encountered in 
native r-hGH are clearly labeled. 

Thus, although it is obvious that the chromatogram in
Figure 9 differs considerably from the standard
fragmentation pattern as indicated by the arrows, the
present invention provides some clear advantages in
reducing the potential for incorrect peak matching: (1)
the entire evaluation procedure can be automated to
obtain a final sample score without the need for
operator intervention; and (2) the scoring procedure is
completely digital and therefore not subject to
observer bias.

Turning to Figures 10-13, application of the method of
the present invention will be described. It should be
understood that where input is to be supplied to a
program or subprogram, said input can be provided in
interactive mode by an operator or can be taken
directly from a file containing the pertinent
information.

The subprogram Make-Library (Figure 10) implements the
reduction of raw data to the two data sets described in
this invention. User input specific to this subprogram,
such as the names for input and output files,
wavelength selection, and integration parameters, is
supplied at step 101.

The file retrieved at step 102 is a raw data file
containing absorbance data as a function of both
wavelength and time as would be appropriate for the
information generated by a diode-array detector. Any
such format could in principle be processed by the
subroutine, provided that low level routines for
interpretation of the file format are available. In the
preferred embodiment of the invention the format of raw
data is that produced by the Hewlett-Packard (HP)
diode-array detector.

After raw data have been retrieved from the magnetic
media, an appropriate signal characterizing the
chromatographic peak response is chosen for analysis of
peak data at step 103. A typical peak response would be
the absorbance as a function of time at a specific
wavelength or wavelength range, selected such that all
compounds of interest will exhibit absorbance at said
wavelength or wavelength range. However, it is possible
to use the average or maximum absorbance over the
wavelength range recorded, or a subrange thereof, as
the peak response at a given time point.

Once a signal has been determined, the subprogram finds
all peaks for this signal in step 104 by employing
standard integration algorithms as implemented on the
HP ChemStation or any other such algorithm similar in
nature to those customarily employed in chromatographic
data handling. The result of the peak finding step is
the determination of peak start, end, apex (retention
time), area, and height, as well as of the number of
peaks encountered, which is assigned to variable P in
step 105.

At step 106, a library file is created which will lat- 
er receive relevant peak data as generated in subse- 
quent portions of this subprogram. This library file is 
typically referred to as a sample library. 

Next a counter is initialized to a value of 1 at step
107 and the apex spectrum for the peak indexed by
the counter is found by the subprogram at step 108.
Appropriate reference spectra are then selected at 
step 109, typically at the beginning and end of the 
peak where normally only the solvent background is 
present. Other criteria for the selection might be em- 
ployed, especially in cases where neighboring peaks 
are not fully separated. The number of reference 
spectra employed may also be varied depending on 
the characteristics of the chromatographic system 
employed. 

In step 110, the reference spectra are then used to
remove unwanted background absorbance from the apex
spectrum in order to obtain a peak spectrum
characteristic of the current peak. Although a number
of different approaches can be used to construct this
background correction, the preferred mode is to use
linear interpolation of the reference spectra to the
retention time of the apex spectrum and to subtract the
interpolated spectrum from the apex spectrum. Another
approach would, for example, involve principal
component analysis of the solvent background followed
by linear least squares subtraction.
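
The preferred background correction (linear interpolation of two reference spectra to the apex time, followed by subtraction) can be sketched as below. This is an illustration only: it assumes exactly two reference spectra, one before and one after the peak, sampled at the same wavelengths as the apex spectrum, and the function name `background_corrected` is hypothetical.

```python
def background_corrected(apex_spectrum, t_apex,
                         ref_start, t_start, ref_end, t_end):
    """Subtract a background linearly interpolated to the apex time."""
    if t_end == t_start:
        raise ValueError("reference spectra must be taken at different times")
    # Fractional position of the apex between the two reference times.
    frac = (t_apex - t_start) / (t_end - t_start)
    # Interpolate the background at each wavelength, then subtract it.
    return [a - (s + frac * (e - s))
            for a, s, e in zip(apex_spectrum, ref_start, ref_end)]
```

For an apex midway between the two reference times, the subtracted background is simply the average of the two reference spectra.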

At step 111, an optional wavelength calibration can be
applied to the peak spectrum by shifting the wavelength
axis left or right by a constant wavelength amount as
determined previously outside the scope of the
subprogram. This wavelength correction is important
primarily in cases where data for different samples
might be obtained from different instruments or be
derived over long periods of time on the same
instrument.

At step 112 any number of possible mathematical
treatments can be applied to the peak spectrum.
Examples of such treatments are smoothing, the
formation of higher order derivatives, splining of the
wavelength axis to obtain better resolution, or any
transformation of the spectrum.

The peak spectrum is transferred to the sample library
at step 113 and the other peak data for the current
peak as determined during the integration step (104)
are transferred to the sample library at step 114.
Finally, at step 115, the counter is incremented and
checked against the number of peaks P in step 116. If
another peak needs to be processed the subprogram
returns to step 108; otherwise the subprogram execution
is complete.

The Compare-Libs subprogram (Figure 11) provides for
most of the detailed matching between any two samples
presented to the subprogram in the form of a sample
library for each sample. In implementing this
subprogram, the first sample is considered to be the
reference or standard sample to be matched by the
second sample. It will, however, be understood that the
first sample can be of completely unknown nature, as
can the second sample. It should also be understood
that a 'sample library' can contain data from either a
single analysis of a sample processed by the
Make-Library subroutine or data derived from multiple
analyses of the same sample as they would be correlated
by the Make-Std-Library subprogram from sample
libraries generated with the Make-Library subprogram.

In step 201, user parameters pertinent to this
subprogram are requested. User parameters include the
names of the sample libraries involved as well as
parameters describing the characteristics of the
matching process.

In step 202 the first (reference) sample library is
retrieved from magnetic media and is referred to as L1.
The number of peaks stored in this library is
determined and assigned to variable P1 in step 203.

Steps 204 and 205 repeat the previous two steps for the
second sample library, assigning the library name to L2
and the number of peaks to P2, respectively.

Step 206 consists of a retention time correction,
whereby reference peaks defined in the reference sample
and expected to occur at the retention times stored in
the reference sample library are compared against the
retention times actually encountered in the second
sample. Appropriate corrections are performed to the
retention times of the second sample to make them
correspond to those of the first sample. Any one of a
variety of possible procedures can be employed in this
correction process, the simplest of which is a
piecewise linear fit between expected and actual
retention times. Those skilled in the art will
recognize that this correction may not be necessary.

In step 207 peak areas and peak heights of both the
first and second samples can be normalized in a number
of ways. Two possible methods are normalization to the
total area and height of all peaks in either sample,
such that all peaks are scaled to obtain an arbitrarily
selected constant value for these parameters, or to the
area and height of selected reference peaks, where
normalization implies that all peaks are scaled to
obtain arbitrarily selected constant values for these
reference peaks. Depending on the nature of the
chromatographic separation applied to the two samples,
this step may not prove necessary.
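
Steps 206 and 207 can be sketched as two small helpers. Both are illustrations under stated assumptions: the retention time correction uses piecewise linear interpolation between reference peaks, with times outside the reference range extrapolated from the nearest segment (a choice the text does not specify), and the normalization scales values to an arbitrarily chosen total of 100. The names `correct_retention_time` and `normalize_to_total` are hypothetical.

```python
from bisect import bisect_right


def correct_retention_time(t, actual_refs, expected_refs):
    """Map a second-sample retention time onto the reference time axis.

    actual_refs must be strictly increasing; expected_refs holds the
    reference-library times of the same peaks, in the same order.
    """
    if len(actual_refs) != len(expected_refs) or len(actual_refs) < 2:
        raise ValueError("need at least two paired reference peaks")
    # Locate the segment containing t; clamp so that times outside the
    # reference range are extrapolated from the first or last segment.
    i = bisect_right(actual_refs, t) - 1
    i = max(0, min(i, len(actual_refs) - 2))
    x0, x1 = actual_refs[i], actual_refs[i + 1]
    y0, y1 = expected_refs[i], expected_refs[i + 1]
    return y0 + (t - x0) * (y1 - y0) / (x1 - x0)


def normalize_to_total(values, target=100.0):
    """Scale peak areas (or heights) so that they sum to target."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize values that sum to zero")
    return [v * target / total for v in values]
```

With hypothetical reference peaks found at 10, 20 and 40 min but expected at 10.5, 20 and 39 min, a peak observed at 30 min would be mapped to 29.5 min.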

Next, two counters are initialized in step 208: one
(i), for the peak currently to be matched, is set to 1;
the other (k) will count the number of matches found
for the current peak, up to a maximum of 10, which will
be stored in a table of match values.

In step 209, relevant peak data for the peak currently indexed by
i are retrieved from L1, and a retention time window centered upon
the retention time of the current peak is constructed in step 210.
This retention time window depends on knowledge of the
chromatographic system employed in the separation of the first and
second samples and can be extended to the total time spanned by
the analysis of the first sample.

A second peak counter (j) for peaks in the second sample is
initialized to 1 in step 211, and data for the peak indexed by j
are retrieved from L2 at step 212. A branch point is provided at
step 213, which tests whether peak j is inside the retention time
window defined in step 210. If it is not, control passes to step
221. Otherwise, the subprogram continues on to step 214, where
MTdis and CN are calculated for the data from peaks i in L1 and j
in L2.

Those skilled in the art will recognize that the calculation of
MTdis and CN can be done in a number of different ways, as
described elsewhere in this invention, depending on the amount of
information available for each peak, such as multiple, average or
individual spectra for the first or the second or both samples.

In step 215 any or all of the deviations defined in equations
(4)-(8) are calculated from the relevant data for peak i in L1 and
peak j in L2.

Next, in step 216, the number (k) of matches found so far is
compared against the maximum number of matches allowed, which is
arbitrarily set to a constant value of 10 but could be modified to
any other meaningful value. If fewer than 10 matches have been
found, the match counter is incremented in step 219. Otherwise,
step 217 tests whether the match for the current peak is better
than any of those currently stored. If it is, the match with the
lowest score is deleted in step 218 and execution proceeds to step
220. Otherwise, control is transferred to step 221.

At step 220 the two branches of step 216 and the yes branch of
step 217 converge again, and the match information for the current
peak is inserted into the match table at the appropriate position.
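The bookkeeping of steps 216 to 220 amounts to maintaining a bounded, ordered table of the best matches. The sketch below is one possible reading, assuming that a higher score denotes a better match and that a new match is stored when it beats at least the weakest stored entry; `insert_match` and the tuple layout are hypothetical.

```python
MAX_MATCHES = 10  # the arbitrarily chosen table size mentioned in the text


def insert_match(table, candidate, max_matches=MAX_MATCHES):
    """Insert `candidate` = (score, peak_j) into `table`, kept best-first.

    Returns True if the candidate was stored and False if it was rejected
    (the "otherwise" branch leading to step 221).
    """
    if len(table) >= max_matches:
        # Table full: keep the candidate only if it beats the weakest entry.
        if candidate[0] <= table[-1][0]:
            return False
        table.pop()  # step 218: delete the match with the lowest score
    # Step 220: insert at the position that keeps the table sorted by score.
    pos = 0
    while pos < len(table) and table[pos][0] >= candidate[0]:
        pos += 1
    table.insert(pos, candidate)
    return True
```

With this layout the length of the table plays the role of the match counter k, so no separate increment is needed.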

At step 221 the counter j for the current peak in L2 is
incremented and tested in step 222 against P2, the total number of
peaks in L2. If j exceeds P2, the subprogram continues with step
223; otherwise, the next peak from L2 is processed by returning to
step 212.

In step 223 the counter i for peaks in L1 is incremented and, in
step 224, tested against P1, the total number of peaks in L1. If i
exceeds P1, the subprogram continues with step 225; otherwise the
next peak from L1 is processed by returning to step 209.
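The two nested loops over i and j (steps 209 to 224) can be summarized in a short sketch. It only collects the index pairs that pass the retention time window test of step 213; the dictionary key `"rt"` and the symmetric `half_width` parameter are assumptions made for illustration, and indices are 0-based here whereas the text counts from 1.

```python
def candidate_pairs(lib1, lib2, half_width):
    """Return (i, j) index pairs whose retention times fall in the same window.

    lib1, lib2 -- lists of peak records, each a dict with an "rt" entry
    half_width -- half the width of the retention time window (step 210)
    """
    pairs = []
    for i, ref_peak in enumerate(lib1):            # steps 209, 223, 224
        low = ref_peak["rt"] - half_width          # step 210: window around peak i
        high = ref_peak["rt"] + half_width
        for j, peak in enumerate(lib2):            # steps 211, 212, 221, 222
            if low <= peak["rt"] <= high:          # step 213: inside the window?
                pairs.append((i, j))               # steps 214-220 would run here
    return pairs
```

For example, with reference peaks at 1.0 and 5.0, sample peaks at 1.2, 3.0 and 5.1, and a half width of 0.5, only the pairs (0, 0) and (1, 2) reach the scoring steps.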

In step 225 peak assignment takes place between all peaks in L1
and all matches in the match table, such that all conflicts are
resolved by the hierarchical assignment procedure described in
this invention. No more than one peak from L2 is assigned to each
peak of L1, and no peak from L2 is assigned to more than one peak
from L1.
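The one-to-one constraint of step 225 can be illustrated with a simple greedy scheme. This is a stand-in under stated assumptions, not the hierarchical assignment procedure referred to above (which is described elsewhere in the specification): matches are visited from best to worst score, and a pair is accepted only if neither peak has been used yet.

```python
def assign_peaks(matches):
    """Resolve conflicts so that each peak is used at most once.

    matches -- iterable of (score, i, j) tuples, where i indexes a peak in L1
               and j a peak in L2; a higher score is assumed to be better
    Returns a dict mapping each assigned L1 index to its L2 index.
    """
    assignment = {}
    used_j = set()
    for score, i, j in sorted(matches, key=lambda m: m[0], reverse=True):
        if i not in assignment and j not in used_j:
            assignment[i] = j
            used_j.add(j)
    return assignment
```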

Once peak assignment is complete, the peak score PS as defined in
equation (10) is calculated in step 226 for each pair of matched
peaks found in step 225, and the subprogram is terminated.

The Make-Std-Library subprogram (Figure 12) is used to correlate
data from one or several sample libraries to arrive at a standard
library which contains statistical information derived from data
sets 1 and 2 for all peaks, as well as the original data from the
individual libraries.

At step 301 user input is requested and assigned to variable L.
User input may include information such as file names and the
number of sample libraries to be processed.

Next, in step 302, a temporary scratch library 
TEMP is created which will be used in the correlation. 
This library initially contains peak data on all peaks in 
the first sample library. 

At step 303, a counter i is initialized to 2 and tested in step
304 against the total number L of sample libraries. If the counter
exceeds L, the correlation process is complete and statistical
processing commences at step 313. Otherwise, the subprogram
proceeds to step 305.

At step 305 the current library indexed by i is compared to TEMP
using the subroutine Compare-Libs described above. The invocation
of Compare-Libs will result in an assignment between peaks in
TEMP, as reference library, and peaks in the current sample
library. Peak assignment between a given pair of peaks is
considered positive if the peak score as returned by Compare-Libs
is above a user-selected threshold. Any peaks in the current
library not assigned to a peak from TEMP are then removed,
together with all relevant peak data, in step 306.

Step 307 initializes a second counter j to a value one lower than
the current value of i. Steps 308 to 311 will delete all peaks in
TEMP that were not matched by any peak in the current sample
library, as well as the corresponding peaks in all sample
libraries already processed. Therefore, after step 311 all sample
libraries up to the current one, and library TEMP, contain the
same number of peaks, which are all correlated on a one-to-one
basis.

If j tests larger than 0 in step 308, the subprogram proceeds to
step 309, where all peaks corresponding to unmatched peaks in TEMP
are deleted in the sample library indexed by j. In step 310 j is
then decremented and execution returns to step 308 until j tests
equal to zero (0), in which case the subprogram continues with
step 311. At that point the subprogram deletes the unmatched peaks
from TEMP itself in step 311, increments counter i in step 312,
and returns to step 304.
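The net effect of steps 303 to 312 is that only peaks found in every sample library survive. The sketch below expresses that outcome directly rather than reproducing the step-by-step deletion; it assumes the positive assignments from each Compare-Libs call are already available as dictionaries, and all names are hypothetical.

```python
def correlate_libraries(temp_peaks, assignments):
    """Keep only the TEMP peaks that were matched in every sample library.

    temp_peaks  -- list of peak identifiers initially held in TEMP
    assignments -- one dict per compared library, mapping a TEMP peak
                   identifier to the matched peak identifier in that library
                   (positive assignments only)
    Returns (kept_temp_peaks, kept_peaks_per_library), aligned one to one.
    """
    kept = list(temp_peaks)
    for assignment in assignments:
        # Drop TEMP peaks that the current library could not match.
        kept = [peak for peak in kept if peak in assignment]
    per_library = [[assignment[peak] for peak in kept] for assignment in assignments]
    return kept, per_library
```

In the described subprogram each comparison runs against the already reduced TEMP, so the sketch is equivalent only if an assignment for a given peak does not depend on which other peaks remain in TEMP.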

Beginning at step 313, statistical processing of all correlated
sample libraries takes place. Program control is transferred to
this step from step 304 if the test there indicates that all
libraries have been processed (i.e., counter i exceeds the value
of L).

In step 313 the number of peaks remaining in TEMP (and thus in all
sample libraries) is determined and assigned to variable P. A new
library file is created in step 314 to receive the data generated
by the subsequent processing steps. This will be the standard
library produced by the subprogram.

Counter i is again initialized to 1 in step 315, and the peak
spectrum for the peak indexed by i is transferred from each sample
library to the standard library in step 316. An average spectrum
is calculated from the individual peak spectra and also stored in
the standard library in step 317.

Individual peak data for the current peak from each of the sample
libraries are transferred to the standard library in step 318.
This is followed by peak data averaging in each category, which
data are stored in step 319.

In step 320 all appropriate spectral matches Ma are calculated
from the individual and average spectra and transferred to the
standard library in step 321.

The counter is then incremented in step 322 and tested against the
total number of peaks P. If i exceeds P, the program is
terminated. Otherwise, the next peak is processed by returning to
step 316.
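The averaging performed in steps 317 and 319 can be sketched as follows. The representation of a spectrum as a list of absorbances at common wavelengths, and of peak data as a dictionary of numeric categories, is an assumption made for illustration.

```python
def average_spectrum(spectra):
    """Point-by-point mean of several spectra recorded at the same wavelengths."""
    if not spectra:
        raise ValueError("at least one spectrum is required")
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def average_peak_data(records):
    """Mean of each peak data category (e.g. retention time, area, height).

    records -- list of dicts sharing the same numeric keys, one per library
    """
    if not records:
        raise ValueError("at least one record is required")
    n = len(records)
    return {key: sum(r[key] for r in records) / n for key in records[0]}
```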

The Get-Sample-Score program (Figure 13) incorporates the
previously described subprograms to arrive at an overall sample
score indicative of the similarity between any two samples
analyzed by the same or different chromatographic conditions on
the same or different instruments. The overall procedure that
results in the sample score will also identify those peaks in the
two samples that can be considered to be derived from the same
chemical compound present in the two samples.

The overall procedure assumes that raw data for the number of
replicates R1 and R2 defined for the first and second sample,
respectively, are available. This does not preclude the
possibility that these data are generated concurrently with
execution of Get-Sample-Score. Such concurrent generation would
enable completely unattended operation of the overall sample
scoring procedure.

In step 401 user input specific to the overall matching procedure
is requested. Such input includes items such as file names, match
criteria for Compare-Libs, criteria for correlation of sample
libraries by Make-Std-Library, and the weighting factors used for
the calculation of the sample score.

In step 402 a standard library (S1) characteristic of the first
sample and containing data for R1 replicates can be provided. If
one is available, program execution is transferred to step 410.
Otherwise, a standard library is generated in steps 403 through
409.

In step 403 input is requested concerning the number of replicates
for the first sample and assigned to the variable R1. Next, a
counter is initialized to 1 in step 404, and the raw data for the
replicate analysis of the first sample as indexed by the counter
are retrieved in step 405. Subroutine Make-Library is invoked in
step 406 to produce a sample library for the current replicate.
The counter is incremented in step 407 and, if more replicates are
to be processed as tested in step 408, the program returns to step
405. Otherwise, subprogram Make-Std-Library is called next in step
409 to generate a standard library S1 from the individual sample
libraries.

In step 410 a standard library (S2) characteristic of the second
sample and containing data for R2 replicates can be provided. If
such a standard library is available, program execution is
transferred to step 418. Otherwise, a standard library is
generated in steps 411 through 417.

In step 411 input is requested concerning the number of replicates
for the second sample and assigned to variable R2. Next, a counter
is initialized to 1 in step 412, and the raw data for the
replicate analysis of the second sample as indexed by the counter
are retrieved in step 413. The subroutine Make-Library is invoked
in step 414 to produce a sample library for the current replicate.
The counter is incremented in step 415 and, if more replicates are
to be processed as tested in step 416, the program returns to step
413. Otherwise, subprogram Make-Std-Library is called next in step
417 to generate a standard library S2 from the individual sample
libraries.

In step 418 subprogram Compare-Libs is used to match standard
libraries S1 and S2, resulting in output in step 419 of peak
assignments and peak scores for each peak in the first sample.
From the individual peak scores the overall sample score can be
calculated based on equation (11) in step 420. Step 421 provides
for output of the sample score to an appropriate device, and in
step 422 a final report is generated which could incorporate
information on reproducibility and confidence intervals previously
obtained for sample scores from the two samples in question, to
make a decision as to whether or not the two samples are
identical. At this point program execution is complete.



Claims 

1. A method for distinguishing a first chemical compound from a
second chemical compound on the basis of chromatographic data
wherein said chemical compounds absorb ultraviolet radiation,
comprising the steps of:

exposing at least one of the chemical compounds one or more times
to one or more selected wavelengths of ultraviolet radiation;

recording the respective absorbances of at least one of the
chemical compounds upon each exposure to said ultraviolet
radiation;

providing a first data set to processing means, said first data
set comprising the respective absorbances for the first and second
chemical compounds upon one or more exposures to one or more
selected wavelengths of ultraviolet radiation; and

providing at least one spectral match factor by applying, via the
processing means, a spectral matching function to the first data
set;

characterized in that the method further comprises the steps of:

deriving at least one match discriminator (MTdis) from at least
one of the above-mentioned spectral match factors;

providing at least one peak score by applying, via the processing
means, a peak scoring function to the first data set, wherein
providing at least one peak score comprises the steps of:

providing to the processing means weighting factors ($f_m$, $f_r$,
$f_a$) for the match discriminators, retention time deviations
(RTdev), and area and height deviations (AHdev), respectively,
wherein the weighting factor ($f_m$) for the match discriminators
is greater than the weighting factor ($f_r$) for the retention
time deviations and the weighting factor ($f_a$) for the area and
height deviations; and

applying the peak scoring function to the match discriminators,
the retention time deviations, and the area and height deviations
according to:

$$ PS = \frac{f_m \, MT_{dis} + f_r \, RT_{dev} + f_a \, AH_{dev}}{NF} $$

where PS is the peak score, $f_m$ is a weighting factor for the
match discriminator, $f_r$ is a weighting factor for the retention
time deviation, $f_a$ is a weighting factor for the area and
height deviation, and NF is an empirically derived normalization
factor;

distinguishing the first chemical compound from the second
chemical compound on the basis of the peak score (PS).

2. The method of claim 1 characterized in that the match
discriminators (MTdis) are derived according to:

$$ MT_{dis} = \frac{D}{T(DF, prob)} $$

where $MT_{dis}$ is the match discriminator, D is the difference
for the mean match factor derived from automatching and
crossmatching functions, DF is the degrees of freedom, which are
calculated from the number of individual spectra for the first and
second chemical compounds, and T(DF, prob) is the t-value required
for a desired degree of probability (prob, in %) that two means
differing by that t-value are different, given the degrees of
freedom applicable.



3. The method of claim 1 or 2 characterized in that the spectral
matching function is applied to the first data set according to:

$$ MF_g = 1000\,(1 - r^2) $$

where $MF_g$ is a general match factor and r is a correlation
coefficient which relates the absorbances for the first chemical
compound at selected wavelengths to the absorbances for the second
chemical compound at the same wavelengths.

4. The method of claim 1 or 2 characterized in that the spectral
matching function is applied to the first data set and to the
average absorbances according to:

$$ MF_a = 1000\,(1 - r^2) $$

where $MF_a$ is an automatch factor and r is a correlation
coefficient which relates the individual absorbances of a chemical
compound at selected wavelengths to the average absorbances for
the same chemical compound at the same wavelengths.

5. The method of claim 1 or 2 characterized in that the spectral
matching function is applied to the first data set and to the
average absorbances according to:

$$ MF_x = 1000\,(1 - r^2) $$

wherein $MF_x$ is a crossmatch factor and r is a correlation
coefficient which relates the individual absorbances for one of
the chemical compounds at selected wavelengths to the average
absorbances for the other chemical compound at the same
wavelengths.

6. The method of one of the claims 3 to 5 characterized in that r
is applied according to:

$$ r = \frac{\sum xy - (\sum x)(\sum y)/n_f}{\left[\left\{\sum x^2 - (\sum x)^2/n_f\right\}\left\{\sum y^2 - (\sum y)^2/n_f\right\}\right]^{1/2}} $$

where x and y, respectively, are the absorbances of the first and
second chemical compounds at the same wavelength, or where x and
y, respectively, are individual and averaged absorbances for the
same chemical compound at the same wavelength, or where x and y,
respectively, are the individual absorbances for one chemical
compound and averaged absorbances for the other chemical compound
at the same wavelength, and where $\sum$ is the summation function
and $n_f$ is the number of selected wavelengths.

7. The method as claimed in one of the claims 1 to 6 characterized
by the step of preparing the first data set after providing said
data set to the processing means, wherein the step of preparing
the first data set comprises the steps of:

selecting a portion of the data set; and

calibrating the selected portion.

8. The method as claimed in one of the claims 1 to 7 characterized
in that the step of providing said at least one match
discriminator (MTdis) comprises the step of applying, via the
processing means, a match discrimination function to general match
factors.

9. The method as claimed in one of the claims 1 to 8,
characterized in that the step of providing said at least one
retention time deviation (RTdev) comprises the step of applying,
via the processing means, a retention time deviation function to
the retention times according to:

$$ RT_{dev} = \frac{|RT_1 - RT_2|}{RT_{lim}} $$

wherein $RT_{dev}$ is the retention time deviation, $RT_1$ is the
average of retention times for the first chemical compound, $RT_2$
is the average of retention times for the second chemical
compound, and $RT_{lim}$ is a limit of variability for retention
times.

10. The method as claimed in one of the claims 1 to 9
characterized by the steps of:

providing at least one peak area deviation (ARdev) by applying,
via the processing means, a peak area deviation function to the
peak areas; and

further distinguishing the first chemical compound from the second
chemical compound on the basis of at least one peak area
deviation;

wherein the peak area deviation function is applied to the peak
areas according to:

$$ AR_{dev} = \frac{|AR_1 - AR_2|}{AR_{lim}} $$

wherein $AR_{dev}$ is the peak area deviation, $AR_1$ is the
average of peak areas for the first chemical compound, $AR_2$ is
the average of peak areas for the second chemical compound, and
$AR_{lim}$ is a limit of variability for peak area.

11. The method as claimed in one of the claims 1 to 10
characterized by the steps of:

providing at least one peak height deviation (HTdev) by applying,
via the processing means, a peak height deviation function to the
peak heights; and

further distinguishing the first chemical compound from the second
chemical compound on the basis of at least one peak height
deviation;

wherein the peak height deviation function is applied to the peak
heights according to:

$$ HT_{dev} = \frac{|HT_1 - HT_2|}{HT_{lim}} $$

where $HT_{dev}$ is the peak height deviation, $HT_1$ is the
average of peak heights for the first chemical compound, $HT_2$ is
the average of peak heights for the second chemical compound, and
$HT_{lim}$ is a limit of variability for peak height.

12. The method as claimed in one of the claims 1 to 11
characterized in that the step of providing at least one area and
height deviation (AHdev) comprises the step of applying, via the
processing means, an area and height deviation function to the
peak area deviations and the peak height deviations according to:

$$ AH_{dev} = \frac{AR_{dev} + HT_{dev}}{2} $$

wherein $AH_{dev}$ is the area and height deviation, $AR_{dev}$ is
the peak area deviation, and $HT_{dev}$ is the peak height
deviation.
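For orientation, the formulas recited in claims 1 to 12 can be collected in a short Python sketch. It is a plain transcription under stated assumptions: the function names are hypothetical, the t-value T(DF, prob) is assumed to be supplied by the caller (for example from a statistical table), and the way D and DF are obtained is not reproduced here.

```python
from math import sqrt


def correlation(x, y):
    """Correlation coefficient r of claim 6 for absorbances x and y
    taken at the same n_f selected wavelengths."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    # A spectrum with constant absorbance makes the denominator zero.
    return (sxy - sx * sy / n) / sqrt((sxx - sx * sx / n) * (syy - sy * sy / n))


def match_factor(x, y):
    """Match factor 1000 * (1 - r^2) of claims 3 to 5."""
    r = correlation(x, y)
    return 1000.0 * (1.0 - r * r)


def match_discriminator(d, t_value):
    """MT_dis = D / T(DF, prob) of claim 2; t_value is supplied by the caller."""
    return d / t_value


def deviation(mean_1, mean_2, limit):
    """|m1 - m2| / limit, the common form of RT_dev, AR_dev and HT_dev."""
    return abs(mean_1 - mean_2) / limit


def area_height_deviation(ar_dev, ht_dev):
    """AH_dev = (AR_dev + HT_dev) / 2 of claim 12."""
    return (ar_dev + ht_dev) / 2.0


def peak_score(mt_dis, rt_dev, ah_dev, fm, fr, fa, nf):
    """PS of claim 1; f_m must exceed both f_r and f_a."""
    if not (fm > fr and fm > fa):
        raise ValueError("f_m must be greater than f_r and f_a")
    return (fm * mt_dis + fr * rt_dev + fa * ah_dev) / nf
```

Two spectra that differ only by a scale factor give r = 1 and therefore a match factor of 0, so under this definition smaller match factors indicate closer spectral agreement.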



Patentansprüche (German-language claims)

1. A method for distinguishing a first chemical compound from a
second chemical compound on the basis of chromatographic data, in
which the chemical compounds absorb ultraviolet radiation,
comprising the following steps:

exposing at least one of the chemical compounds one or more times
to one or more selected wavelengths of ultraviolet radiation;

recording the respective absorbances of at least one of the
chemical compounds each time it is exposed to the ultraviolet
radiation;

supplying a first data set to a processing device, the first data
set comprising the respective absorbances for the first and second
chemical compounds exposed one or more times to one or more
selected wavelengths of ultraviolet radiation; and

providing at least one spectral match factor by applying, via the
processing device, a spectral matching function to the first data
set;

characterized in that the method further comprises the following
steps:

deriving at least one match discriminator (MTdis) from at least
one of the above-mentioned spectral match factors;

providing at least one peak score by applying, via the processing
device, a peak scoring function to a first data set, wherein
providing at least one peak score includes the following steps:

supplying to the processing device weighting factors ($f_m$,
$f_r$, $f_a$) for the match discriminators, retention time
deviations (RTdev), and area and height deviations (AHdev),
respectively, wherein the weighting factor ($f_m$) for the match
discriminators is greater than the weighting factor ($f_r$) for
the retention time deviations and the weighting factor ($f_a$) for
the area and height deviations; and

applying the peak scoring function to the match discriminators,
the retention time deviations, and the area and height deviations
according to the following equation:

$$ PS = \frac{f_m \, MT_{dis} + f_r \, RT_{dev} + f_a \, AH_{dev}}{NF} $$

where PS is the peak score, $f_m$ is a weighting factor for the
match discriminator, $f_r$ is a weighting factor for the retention
time deviation, $f_a$ is a weighting factor for the area and
height deviation, and NF is an empirically derived normalization
factor;

wherein the first chemical compound is distinguished from the
second chemical compound on the basis of the peak score (PS).

2. The method according to claim 1, characterized in that the
match discriminators (MTdis) are derived according to the
following equation:

$$ MT_{dis} = \frac{D}{T(DF, prob)} $$

where $MT_{dis}$ is the match discriminator, D is the difference
for the mean match factor derived from the automatching and
crossmatching functions, DF is the degrees of freedom, which are
calculated from the number of individual spectra for the first and
second chemical compounds, and T(DF, prob) is the t-value required
for a desired degree of probability (prob, in percent) that two
means differing by this t-value are different, given the degrees
of freedom applicable.

3. The method according to claim 1 or 2, characterized in that the
spectral matching function is applied to the first data set
according to the following equation:

$$ MF_g = 1000\,(1 - r^2) $$

where $MF_g$ is a general match factor and r is a correlation
coefficient which relates the absorbances for the first chemical
compound at selected wavelengths to the absorbances of the second
chemical compound at the same wavelengths.

4. The method according to claim 1 or 2, characterized in that the
spectral matching function is applied to the first data set and to
the average absorbances according to the following equation:

$$ MF_a = 1000\,(1 - r^2) $$

where $MF_a$ is an automatch factor and r is a correlation
coefficient which relates the individual absorbances of a chemical
compound at selected wavelengths to the average absorbances for
the same chemical compound at the same wavelengths.

5. The method according to claim 1 or 2, characterized in that the
spectral matching function is applied to the first data set and to
the average absorbances according to the following equation:

$$ MF_x = 1000\,(1 - r^2) $$

where $MF_x$ is a crossmatch factor and r is a correlation
coefficient which relates the individual absorbances of one of the
chemical compounds at selected wavelengths to the average
absorbances of the other chemical compound at the same
wavelengths.

6. The method according to one of claims 3 to 5, characterized in
that r is applied according to the following equation:

$$ r = \frac{\sum xy - (\sum x)(\sum y)/n_f}{\left[\left\{\sum x^2 - (\sum x)^2/n_f\right\}\left\{\sum y^2 - (\sum y)^2/n_f\right\}\right]^{1/2}} $$

where x and y, respectively, are the absorbances of the first and
second chemical compounds at the same wavelength, or where x and
y, respectively, are individual and averaged absorbances for the
same chemical compound at the same wavelength, or where x and y,
respectively, are the individual absorbances for one chemical
compound and the averaged absorbances for the other chemical
compound at the same wavelength, where $\sum$ is the summation
function and $n_f$ is the number of selected wavelengths.

7. The method according to one of claims 1 to 6, characterized by
the step of preparing the first data set after supplying the data
set to the processing device, wherein the step of preparing the
first data set includes the following steps:

selecting a portion of the data set; and

calibrating the selected portion.

8. The method according to one of claims 1 to 7, characterized in
that the step of providing at least one match discriminator
(MTdis) includes the step of applying, via the processing device,
a match discrimination function to the general match factors.

9. The method according to one of claims 1 to 8, characterized in
that the step of providing the at least one retention time
deviation (RTdev) includes the step of applying, via the
processing device, a retention time deviation function to the
retention times according to the following equation:

$$ RT_{dev} = \frac{|RT_1 - RT_2|}{RT_{lim}} $$

where $RT_{dev}$ is the retention time deviation, $RT_1$ is the
average of the retention times for the first chemical compound,
$RT_2$ is the average of the retention times for the second
chemical compound, and $RT_{lim}$ is a limit for the variability
of the retention times.

10. The method according to one of claims 1 to 9, characterized by
the following steps:

providing at least one peak area deviation (ARdev) by applying,
via the processing device, a peak area deviation function to the
peak areas; and

further distinguishing the first chemical compound from the second
chemical compound on the basis of at least one peak area
deviation;

wherein the peak area deviation function is applied to the peak
areas according to the following equation:

$$ AR_{dev} = \frac{|AR_1 - AR_2|}{AR_{lim}} $$

where $AR_{dev}$ is the peak area deviation, $AR_1$ is the average
of the peak areas for the first chemical compound, $AR_2$ is the
average of the peak areas for the second chemical compound, and
$AR_{lim}$ is a limit for the variability of the peak area.

11. The method according to one of claims 1 to 10, characterized
by the following steps:

providing at least one peak height deviation (HTdev) by applying,
via the processing device, a peak height deviation function to the
peak heights; and

further distinguishing the first chemical compound from the second
chemical compound on the basis of at least one peak height
deviation;

wherein the peak height deviation function is applied to the peak
heights according to the following equation:

$$ HT_{dev} = \frac{|HT_1 - HT_2|}{HT_{lim}} $$

where $HT_{dev}$ is the peak height deviation, $HT_1$ is the mean
of the peak heights for the first chemical compound, $HT_2$ is the
mean of the peak heights for the second chemical compound, and
$HT_{lim}$ is a limit for the variability of the peak height.

12. The method according to one of claims 1 to 11, characterized
in that the step of providing at least one area and height
deviation (AHdev) includes the step of applying, via the
processing device, an area and height deviation function to the
peak area deviations and the peak height deviations according to
the following equation:

$$ AH_{dev} = \frac{AR_{dev} + HT_{dev}}{2} $$

where $AH_{dev}$ is the area and height deviation, $AR_{dev}$ is
the peak area deviation, and $HT_{dev}$ is the peak height
deviation.



Revendications (French-language claims)




1. A method for distinguishing a first chemical compound from a
second chemical compound on the basis of chromatographic data,
said chemical compounds absorbing ultraviolet radiation,
comprising the steps of:

exposing at least one of the chemical compounds one or more times
to one or more selected wavelengths of ultraviolet radiation;

recording the respective absorbances of at least one of the
chemical compounds upon each exposure to said ultraviolet
radiation;

providing a first data set to processing means, said first data
set comprising the respective absorbances for the first and second
chemical compounds upon one or more exposures to one or more
selected wavelengths of ultraviolet radiation; and

providing at least one spectral match factor by applying, via the
processing means, a spectral matching function to the first data
set;

characterized in that it further comprises the steps of:

deriving at least one match discriminator (MTdis) from at least
one of the above-mentioned spectral match factors;

providing at least one peak score by applying, via the processing
means, a peak scoring function to the first data set, the
provision of at least one peak score comprising the steps of:

providing to the processing means weighting factors ($f_m$, $f_r$,
$f_a$) for the match discriminators, retention time deviations
(RTdev), and area and height deviations (AHdev), respectively, the
weighting factor ($f_m$) for the match discriminators being
greater than the weighting factor ($f_r$) for the retention time
deviations and the weighting factor ($f_a$) for the area and
height deviations; and

applying the peak scoring function to the match discriminators,
the retention time deviations, and the area and height deviations
according to the equation:

$$ PS = \frac{f_m \, MT_{dis} + f_r \, RT_{dev} + f_a \, AH_{dev}}{NF} $$

where PS is the peak score, $f_m$ is a weighting factor for the
match discriminator, $f_r$ is a weighting factor for the retention
time deviation, $f_a$ is a weighting factor for the area and
height deviation, and NF is an empirically derived normalization
factor;

distinguishing the first chemical compound from the second
chemical compound on the basis of the peak score (PS).

2. Method according to claim 1, characterized in that the match discriminators (MTdis) are derived according to the equation:

$$MT_{dis} = \frac{D}{T(DF, prob)}$$

wherein $MT_{dis}$ is the match discriminator, $D$ is the difference in the mean match factor, derived from self match and cross match functions, $DF$ represents the degrees of freedom, which are calculated from the number of individual spectra for the first and second chemical compounds, and $T(DF, prob)$ is the t-value required, for a desired degree of probability ($prob$, in %), for two means that differ by the t-value to be different given the applicable degrees of freedom.
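
As a rough illustration of claim 2, the sketch below divides a difference of mean match factors by a t-value supplied by the caller. The function name `match_discriminator` is hypothetical, and the claim does not state how the degrees of freedom are computed from the spectrum counts, so the t-value is passed in directly rather than derived. The value 2.306 used in the example is the standard two-sided 95 % critical value of Student's t for 8 degrees of freedom; whether the patent intends a one-sided or two-sided value is an assumption here.

```python
def match_discriminator(d, t_value):
    """Return MTdis = D / T(DF, prob).

    d       -- difference between mean match factors (illustrative name)
    t_value -- critical t-value for the chosen degrees of freedom and
               probability, looked up by the caller (assumption: the claim
               does not say how it is obtained)
    """
    if t_value <= 0:
        raise ValueError("t-value must be positive")
    return d / t_value


# Illustrative numbers only: D = 46.12, t = 2.306 (two-sided 95 %, DF = 8).
mt_dis = match_discriminator(46.12, 2.306)
print(mt_dis)
```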



3. Method according to claim 1 or 2, characterized in that the spectral match function is applied to the first data set according to the equation:

$$MF_g = 1000\,(1 - r^2)$$

wherein $MF_g$ is a global match factor and $r$ is a correlation coefficient relating the absorbances for the first chemical compound at selected wavelengths to the absorbances for the second chemical compound at the same wavelengths.

4. Method according to claim 1 or 2, characterized in that the spectral match function is applied to the first data set and to the mean absorbances according to the equation:

$$MF_a = 1000\,(1 - r^2)$$

wherein $MF_a$ is a self match factor and $r$ is a correlation coefficient relating the individual absorbances of a chemical compound at selected wavelengths to the mean absorbances for the same chemical compound at the same wavelengths.

5. Method according to claim 1 or 2, characterized in that the spectral match function is applied to the first data set and to the mean absorbances according to the equation:

$$MF_x = 1000\,(1 - r^2)$$

wherein $MF_x$ is a cross match factor and $r$ is a correlation coefficient relating the individual absorbances for one of the chemical compounds at selected wavelengths to the mean absorbances for the other chemical compound at the same wavelengths.

6. Method according to any one of claims 3 to 5, characterized in that $r$ is calculated according to the equation:

$$r = \frac{\sum xy - \dfrac{(\sum x)(\sum y)}{n_f}}{\left[\left(\sum x^2 - \dfrac{(\sum x)^2}{n_f}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n_f}\right)\right]^{1/2}}$$

wherein $x$ and $y$, respectively, are the absorbances of the first and second chemical compounds at the same wavelength, or wherein $x$ and $y$, respectively, are individual and averaged absorbances for the same chemical compound at the same wavelength, or wherein $x$ and $y$, respectively, are the individual absorbances for one chemical compound and averaged absorbances for the other chemical compound at the same wavelength, and wherein $\sum$ is the summation function and $n_f$ is the number of selected wavelengths.
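
The match factor of claims 3 to 5 and the correlation coefficient of claim 6 can be sketched together as follows. The function names `correlation` and `match_factor` and the sample absorbance lists are hypothetical; only the two formulas come from the claims. The same function serves for the global, self, and cross match factors, since these differ only in which absorbance series are passed as `x` and `y`.

```python
import math


def correlation(x, y):
    """Correlation coefficient r over n_f selected wavelengths (claim 6 form)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    sum_yy = sum(b * b for b in y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt((sum_xx - sum_x ** 2 / n) * (sum_yy - sum_y ** 2 / n))
    if denominator == 0:
        raise ValueError("a constant spectrum has no defined correlation")
    return numerator / denominator


def match_factor(x, y):
    """Return 1000 * (1 - r^2) for two absorbance series."""
    r = correlation(x, y)
    return 1000.0 * (1.0 - r * r)


# Illustrative absorbances at four wavelengths (not from the patent).
print(match_factor([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]))
```

With this formula a value of 0 corresponds to perfectly correlated spectra, and the value grows toward 1000 as the correlation weakens.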



7. Method according to any one of claims 1 to 6, characterized by the step of preparing the first data set after having provided said data set to the processing means, wherein the step of preparing the first data set comprises the steps of:

selecting a portion of the data set; and

calibrating the selected portion.

8. Method according to any one of claims 1 to 7, characterized in that the step of providing at least one of said match discriminators (MTdis) comprises the step of applying, via the processing means, a match discrimination function to the global match factors.

9. Method according to any one of claims 1 to 8, characterized in that the step of providing at least one of said retention time deviations (RTdev) comprises the step of applying to the retention times, via the processing means, a retention time deviation function according to the equation:

$$RT_{dev} = \frac{|RT_1 - RT_2|}{RT_{lim}}$$

wherein $RT_{dev}$ is a retention time deviation, $RT_1$ is the mean of the retention times for the first chemical compound, $RT_2$ is the mean of the retention times for the second chemical compound, and $RT_{lim}$ is a variability limit for the retention times.
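
A minimal sketch of the retention time deviation of claim 9 follows. The function name `retention_time_deviation` and the replicate values are hypothetical; the means are taken over whatever replicate retention times the caller supplies.

```python
from statistics import mean


def retention_time_deviation(rt_first, rt_second, rt_lim):
    """Return RTdev = |RT1 - RT2| / RTlim.

    rt_first, rt_second -- replicate retention times for each compound
    rt_lim              -- variability limit for retention times
    """
    if rt_lim <= 0:
        raise ValueError("RTlim must be positive")
    return abs(mean(rt_first) - mean(rt_second)) / rt_lim


# Illustrative retention times in minutes (not from the patent).
print(retention_time_deviation([5.0, 5.2], [5.4, 5.6], rt_lim=0.2))
```

A result above 1 means the two mean retention times differ by more than the stated variability limit.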

10. Method according to any one of claims 1 to 9, characterized by the steps of:

providing at least one peak area deviation (ARdev) by applying, via the processing means, a peak area deviation function to the peak areas; and

further distinguishing the first chemical compound from the second chemical compound on the basis of at least one peak area deviation;

wherein the peak area deviation function is applied to the peak areas according to the equation:

$$AR_{dev} = \frac{|AR_1 - AR_2|}{AR_{lim}}$$

wherein $AR_{dev}$ is the peak area deviation, $AR_1$ is the mean of the peak areas for the first chemical compound, $AR_2$ is the mean of the peak areas for the second chemical compound, and $AR_{lim}$ is a variability limit for the peak area.



11. Method according to any one of claims 1 to 10, characterized by the steps of:

providing at least one peak height deviation (HTdev) by applying, via the processing means, a peak height deviation function to the peak heights; and

further distinguishing the first chemical compound from the second chemical compound on the basis of at least one peak height deviation;

wherein the peak height deviation function is applied to the peak heights according to the equation:

$$HT_{dev} = \frac{|HT_1 - HT_2|}{HT_{lim}}$$

wherein $HT_{dev}$ is the peak height deviation, $HT_1$ is the mean of the peak heights for the first chemical compound, $HT_2$ is the mean of the peak heights for the second chemical compound, and $HT_{lim}$ is a variability limit for the peak height.

12. Method according to any one of claims 1 to 11, characterized in that the step of providing at least one area and height deviation (AHdev) comprises the step of applying, via the processing means, an area and height deviation function to the peak area deviations and to the peak height deviations, according to the equation:

$$AH_{dev} = \frac{AR_{dev} + HT_{dev}}{2}$$

wherein $AH_{dev}$ is an area and height deviation, $AR_{dev}$ is a peak area deviation, and $HT_{dev}$ is a peak height deviation.
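
The area and height deviations of claims 10 to 12 share one form, a difference of means divided by a variability limit, followed by a simple average. The sketch below shows that chain; `peak_deviation`, `area_height_deviation`, and all numeric values are hypothetical names and data chosen for illustration.

```python
from statistics import mean


def peak_deviation(first, second, limit):
    """Return |mean(first) - mean(second)| / limit (form of ARdev and HTdev)."""
    if limit <= 0:
        raise ValueError("variability limit must be positive")
    return abs(mean(first) - mean(second)) / limit


def area_height_deviation(areas_1, areas_2, ar_lim, heights_1, heights_2, ht_lim):
    """Return AHdev = (ARdev + HTdev) / 2."""
    ar_dev = peak_deviation(areas_1, areas_2, ar_lim)
    ht_dev = peak_deviation(heights_1, heights_2, ht_lim)
    return (ar_dev + ht_dev) / 2.0


# Illustrative replicate peak areas and heights (not from the patent).
ah_dev = area_height_deviation([100.0, 110.0], [120.0, 130.0], 10.0,
                               [10.0, 12.0], [14.0, 16.0], 4.0)
print(ah_dev)
```

Dividing each difference by its own variability limit puts areas and heights on a common scale before they are averaged.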
[Fig. 10: flowchart of PROGRAM MAKE LIBRARY (reference numerals 101 to 116). Boxes legible in the scan: user input; retrieve raw data file; select signal; find all peaks; number of peaks => P; create file for sample library; initialize counter i => 1; find apex spectrum for peak i; find appropriate reference spectra; do background correction => peak spectrum; calibrate wavelength axis; transfer spectrum to library file; transfer peak data to library file; i = i + 1; end.]
[Fig. 11: flowchart of SUBPROGRAM COMPARE LIBS (reference numerals 200 to 226). Boxes legible in the scan: start; get user input; retrieve library L1; determine number of peaks => P1; retrieve library L2; determine number of peaks => P2; correct retention times; normalize area and height; initialize counters i => 1, k => 0; retrieve data for peak i in L1; calculate retention time window; initialize counter j => 1; retrieve data for peak j in L2; decision "peak j inside retention window?"; calculate MTdis, CN; calculate deviations; decision "match better than table?"; remove entry from table; k = k + 1; insert match into table; j = j + 1; i = i + 1; two loop-termination decisions whose text is not legible; do peak assignment; calculate peak scores; end.]
[Fig. 12: flowchart of SUBPROGRAM MAKE STD LIBRARY (reference numerals 301 to 323). Boxes legible in the scan: user input, number of libraries => L; copy library 1 to TEMP; initialize counter i => 2; a loop-termination decision whose text is not legible; compare libs, LIB i <> TEMP; delete extra peaks in LIB i; initialize counter j => i - 1; decision "j > 0?"; delete peaks from LIB j; j = j + 1 (as printed in the scan); delete peaks from TEMP; count peaks in TEMP => P; create file for standard library; initialize counter i => 1; store spectrum i for all libs to STDLIB; store average spectrum to STDLIB; store info for peak i for all libs to STDLIB; store average peak info to STDLIB; calculate Ma for all spectra; store Ma to STDLIB; i = i + 1; decision "i > P?"; end.]
[Figure (number not legible in the scan): flowchart of MAIN PROGRAM GET SAMPLE SCORE (reference numerals 400 to 422). Boxes legible in the scan: start; get user input; decision "standard library S1 for R1 replicates available?"; input number of replicates => R1; initialize counter i => 1; retrieve raw data for sample i; make library i; a loop-termination decision whose text is not legible; make standard library S1; decision "standard library S2 for R2 replicates available?"; input number of replicates => R2; initialize counter j => 1; retrieve raw data for sample j; make library j; j = j + 1; make standard library S2; compare libs, S1 <> S2; output peak scores; calculate sample score; output sample score; output final report; end.]